Studies in Job Evaluation. I. Factor Analyses of Point 
Ratings for Hourly-Paid Jobs in Three Industrial Plants 


C. H. Lawshe, Jr. and G. A. Satter 


Division of Education and Applied Psychology, Purdue University 


While job analysis may serve many functions (8, 251), its primary pur- 
poses are generally accepted as (1) the derivation of training content, (2) 
the setting up of personnel specifications, (3) the improvement of job 
efficiency, and (4) the establishment of wage structures. Job evaluation, 
the general subject of these papers, is that area of job analysis which has 
as its function the establishment of equitable wage and salary rates of a 
non-incentive character. 

Types of Evaluation Systems. Many systems (4) of job evaluation 
have found acceptance in business and industry, but as has been pre- 
viously pointed out, the principal differences (2, 20) which obtain may be 
accounted for on two general bases: (1) the consideration of the job as a 
whole vs. the consideration of the job by parts or elements, and (2) the 
evaluation of each job against other jobs vs. the evaluation of each job 
against a previously prepared rating scale. 

For example, the simple ranking method measures job against job and 
considers each job as a whole. The classification method considers the 
job as a whole but measures it against previously determined grade 
standards. The factor comparison method (2) evaluates job against job 
but breaks the job into parts or elements. The job rating method con- 
siders the job by elements and measures each against a rating scale. The 
present paper is concerned with this latter method which is variously 
called job rating, point rating, and point evaluation. 

The NEMA System. One of the most widely used of these systems is 
one which Kress (6) adapted from the Western Electric Company’s 
procedure for use by the National Electrical Manufacturers Association. 
This system was later used by the membership of the National Metal 
Trades Association. The scale provides for the rating of jobs on the 
following items: 

Skill                              Responsibility 
   Education                          For Equipment or Process 
   Experience                         For Material or Product 
   Initiative and Ingenuity           For Safety of Others 
                                      For Work of Others 
Effort 
   Physical Demand                 Job Conditions 
   Mental or Visual Demand            Working Conditions 
                                      Unavoidable Hazards 


Evaluation of the job on each of these items is achieved by means of five 
“degrees” or categories, each carrying a different point value. For 
example, the “experience” element, which “is the measure of time on related 
work usually required to enable the individual to perform the work 
satisfactorily and efficiently . . .” (5), is composed of the following degrees 
and point values: 


Degree   Amount of Experience                 Points 
  1      Up to three months                      22 
  2      Over three months up to one year        44 
  3      Over one year up to three years         66 
  4      Over three years up to five years       88 
  5      Over five years                        110 


Points awarded to each job on the “experience” item are combined 
with points awarded to the same job on each of the other ten item scales 
to obtain the total point rating of the job. Total points are then trans- 
muted to money values to establish the wage or salary structure. The 
plants included in this study utilize methods which stem from this basic 
plan, only minor modifications being employed. 
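
The scoring arithmetic described above can be sketched in a few lines of Python. The sketch maps a required amount of experience to its degree and point value using the scale quoted in the text, then sums item points into a total point rating. The function names and the sample job's points on the other items are hypothetical and are not taken from the rating manual.

```python
# Experience scale quoted above: (upper bound in months, degree, points).
# The last degree has no upper bound.
EXPERIENCE_SCALE = [
    (3, 1, 22),
    (12, 2, 44),
    (36, 3, 66),
    (60, 4, 88),
    (float("inf"), 5, 110),
]


def experience_points(months):
    """Return (degree, points) for a required amount of experience in months."""
    if months < 0:
        raise ValueError("months must be non-negative")
    for upper, degree, points in EXPERIENCE_SCALE:
        if months <= upper:
            return degree, points


def total_points(item_points):
    """Sum the points awarded on every item to get the total point rating."""
    return sum(item_points.values())


# Hypothetical job: only the experience value follows the scale above;
# the other item values are invented for illustration.
degree, points = experience_points(18)  # over one year up to three years
job = {"experience": points, "education": 42, "working_conditions": 20}
job_total = total_points(job)
```

The conversion of total points into money values is not specified in the text and is therefore left out of the sketch.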

Purpose of the Study. The primary purpose of the present study is to 
analyze and describe statistically the job rating systems that are now 
functioning in three industrial plants. More specifically, an attempt was 
made to identify the basic factors operating in each of the systems, to 
determine those items which tended to cluster around the factors so 
defined, to determine the significance of each factor in the total point 
rating, and to examine similarities and differences between the plants. 


Data Source and Experimental Procedure 


Source of Data. Point rating data from three industrial plants were 
obtained. None of the plants has fewer than 5000 employees and the 
number of job classifications ranges from 250 to 300. 

Plant A—In reality this is several plants operated by the same company, which 
manufactures aircraft engines exclusively. The jobs include a high 
proportion of machine operations requiring varying degrees of skill. 
The rating system is a slight modification of the basic plan discussed 
above, the only significant variation being that “learning period” 
has been substituted for “experience.” 

Plant B—This plant is an airframe plant. The proportion of machine 
operations is small while the proportion of riveting, assembling, and other 
hand operations is large. The basic system of rating is used without 
modification. 


Table 1 


Intercorrelations of Point Ratings of Each of Eleven Items and of Total Points in the 
Job Evaluation Systems in Three Industrial Plants * 
* The first value in each cell pertains to Plant A, the second to Plant B, and the 
third to Plant C. 

Table 2 
Plant A: Factor Loadings Before and After Rotation 

                                            Before Rotation        After Rotation 
Rating Scale Items                           k1     k2     h²       k1     k2     h² 
(1) Total Points                           .971   .229   .995     .944   .322   .995 
(2) Education                              .841   .497   .954     .976   .026   .953 
(3) Learning Period                        .866   .431   .936     .962   .095   .934 
(4) Initiative and Ingenuity               .851   .452   .929     .961   .069   .928 
(5) Physical Demand                        .143  -.659   .455    -.229   .634   .454 
(6) Mental or Visual Demand                .705   .507   .754     .866  -.055   .753 
(7) Responsibility for Equipment           .504  -.051   .256     .400   .311   .257 
(8) Responsibility for Material            .458   .438   .402     .621  -.128   .402 
(9) Responsibility for Safety of Others    .524  -.579   .610    -.136   .769   .610 
(10) Responsibility for Work of Others     .859   .284   .819     .878   .216   .818 
(11) Working Conditions                    .158  -.735   .565     .256   .706   .564 
(12) Unavoidable Hazards                   .420  -.670   .625     .000   .791   .625 





Plant C—This plant manufactures small caliber ammunition. A large 
proportion of the jobs consists of machine “attending” and visual 
inspection. The basic system with only very minor changes is used. 


Procedure. The job rating data, including ratings on each of the 
eleven items plus the total point rating for each job in the three plants, 
were punched on machine-sort cards. Data from each of the three 











Table 3 
Plant B: Factor Loadings Before and After Rotation 

                                            Before Rotation        After Rotation 
Rating Scale Items                           k1     k2     h²       k1     k2     h² 
(1) Total Points                           .944   .346  1.010     .999   .107  1.010 
(2) Education                              .846   .101   .726     .845  -.107   .725 
(3) Experience                             .824   .480   .909     .915   .266   .908 
(4) Initiative and Ingenuity               .880   .212   .819     .904  -.007   .817 
(5) Physical Demand                       -.274   .579   .410    -.125   .627   .409 
(6) Mental or Visual Demand                .663  -.156   .464     .605  -.312   .464 
(7) Responsibility for Equipment           .593  -.243   .413     .518  -.380   .412 
(8) Responsibility for Material            .543   .074   .300     .545  -.059   .300 
(9) Responsibility for Safety of Others    .484  -.061   .238     .455  -.176   .238 
(10) Responsibility for Work of Others     .540   .127   .308     .555  -.099   .302 
(11) Working Conditions                   -.236   .395   .212    -.133   .440   .211 
(12) Unavoidable Hazards                   .230   .122   .068     .253   .062   .068 

plants were treated separately and all intercorrelations between item ratings 
were computed. These correlations are presented in Table 1, the first 
figure in each cell representing Plant A, the second, Plant B, and the third, 
Plant C. The matrix for each plant was factor analyzed using Thurstone’s 
centroid method. Two factors were extracted from the matrices of Plants 
A and B and four from Plant C. The extraction processes were stopped 
at these points since Thurstone’s “phi” test had been satisfied and the 
communalities (h²) for the “total points” item in each case approximated 
unity. The centroid loadings of the factors in each of the three sets of 
data as they were derived from the analysis are presented in Tables 2, 3, 
and 4. These values were “transformed” by means of a procedure 
outlined by Peters and VanVoorhis (7, 264-268). This procedure has the 
same objectives as the “rotation” procedure as it is used in the centroid 
method. The “transformed” or “rotated” values are also presented in 
Tables 2, 3, and 4. 
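
As a hedged illustration of how the communalities relate to the loadings, the following Python sketch recomputes h² as the sum of squared loadings for a few rows of Table 2, both before and after rotation, and compares the result with the tabled value. It assumes the factors are orthogonal, so that a rotation leaves h² unchanged apart from rounding. The centroid extraction and the Peters and VanVoorhis transformation themselves are not reproduced, and the function names are hypothetical.

```python
# Selected rows of Table 2 (Plant A): loadings on the two factors before and
# after rotation, with the communality (h^2) reported before rotation.
TABLE_2_ROWS = {
    "Total Points": {"before": (0.971, 0.229), "after": (0.944, 0.322), "h2": 0.995},
    "Education": {"before": (0.841, 0.497), "after": (0.976, 0.026), "h2": 0.954},
    "Physical Demand": {"before": (0.143, -0.659), "after": (-0.229, 0.634), "h2": 0.455},
    "Unavoidable Hazards": {"before": (0.420, -0.670), "after": (0.000, 0.791), "h2": 0.625},
}


def communality(loadings):
    """Sum of squared factor loadings for one item."""
    return sum(k * k for k in loadings)


def max_discrepancy(rows):
    """Largest gap between a recomputed communality and the tabled value."""
    worst = 0.0
    for row in rows.values():
        for key in ("before", "after"):
            worst = max(worst, abs(communality(row[key]) - row["h2"]))
    return worst


# Small gaps are expected because the tabled loadings are rounded to three places.
worst_gap = max_discrepancy(TABLE_2_ROWS)
```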


Findings and Interpretation 


Factor Names. Factor I was found to be present in all three sets of 
data and accounts for most of the variance in total point ratings. 
Loadings (see Table 5) for “experience” or “learning period” range from .869 
to .962, for “education” from .729 to .976, and for “initiative and 
ingenuity” from .867 to .961. Tables 2, 3, and 4 indicate high 
communalities on each of these items. Since each of these highly loaded items 
represents the requirements which the job imposes upon the individual 
who is to perform it successfully, this factor has been named “Skill 
Demands.” 

Factor IIA, present only in Plant A, has its heaviest loadings (see 
Table 5) in “unavoidable hazards,” “responsibility for the safety of 
others,” “working conditions,” and “physical demand.” All of these 
represent aspects of the job itself which the employee must contend with 
and for which, it is generally agreed, he should receive compensation. 
This factor has been named “Job Characteristics.” 

Factor IIBC is present in Plants B and C. Its heaviest loadings are 
found in “physical demand” and “working conditions.” It has seemed 
logical to name this factor “Job Characteristics—Non-Hazardous” since 
it apparently represents the non-hazardous aspects present in Factor IIA. 

Factor III, present only in the munitions plant (Plant C), represents 
the other portion of Factor IIA which is not included in Factor IIBC, 
namely “Job Characteristics—Hazardous.” Loadings are heaviest in 
“unavoidable hazards” and “responsibility for the safety of others.” 
These various combinations of the same items which form Factors IIA, 
IIBC, and III in the different plants have a rational explanation. In 

Table 5 
Factor Names with Rating Scale Items Arranged in Order of Magnitude of Loadings 

I. Skill Demands 
   Plant A 
      (2) Education                              .976 
      (3) Learning Period                        .962 
      (4) Initiative and Ingenuity               .961 
      (10) Responsibility for Work of Others     .878 
      (6) Mental or Visual Demand                .866 
      (8) Responsibility for Material            .621 
      (7) Responsibility for Equipment           .400 
   Plant B 
      (3) Experience                             .915 
      (4) Initiative and Ingenuity               .904 
      (2) Education                              .845 
      (6) Mental or Visual Demand                .605 
      (10) Responsibility for Work of Others     .555 
      (8) Responsibility for Material            .545 
      (7) Responsibility for Equipment           .518 
      (9) Responsibility for Safety of Others    .455 
      (12) Unavoidable Hazards                   .253 
   Plant C 
      (3) Experience                             .869 
      (4) Initiative and Ingenuity               .867 
      (2) Education                              .729 
      (10) Responsibility for Work of Others     .486 

II A. Job Characteristics 
   Plant A 
      (12) Unavoidable Hazards                   .791 
      (9) Responsibility for Safety of Others    .769 
      (11) Working Conditions                    .706 
      (5) Physical Demand                        .634 

II BC. Job Characteristics, Non-Hazardous 
   Plant B 
      (5) Physical Demand                        .627 
      (11) Working Conditions                    .440 
   Plant C 
      (5) Physical Demand                        .842 
      (11) Working Conditions                    .492 

III. Job Characteristics, Hazardous 
   Plant C 
      (12) Unavoidable Hazards                   .748 
      (9) Responsibility for Safety of Others    .734 
      (7) Responsibility for Equipment           .452 

IV. Attention Demands 
   Plant C 
      (6) Mental or Visual Demand                .565 
      (8) Responsibility for Material            .414 

Plant A, a more or less typical or average plant where the job hazards 
are neither excessive nor non-existent and where hazard and surroundings 
appear to fluctuate together, “unavoidable hazards” and “responsibility 
for the safety of others” combine with “working conditions” and “physical 
demand” to make an all-inclusive “Job Characteristics” factor. In the 
case of Plant C, where the hazards are great and are not in reality a 
function of surroundings but of the material handled, these hazard items, being 
less related to the comfort aspects of the job, split off to make up Factor 
III, “Job Characteristics—Hazardous.” On the other hand, in Plant B, 
where there is a minimum of moving machinery and a minimum 
opportunity for personal harm in connection with the job, these hazard and 
safety items disappear from the basic Factor IIA, leaving only “physical 
demand” and “working conditions” heavily loaded with Factor IIBC, 
“Job Characteristics—Non-Hazardous.” 

Factor IV, which is present only in Plant C, has been named “Attention 
Demands.” As has been stated before, this particular plant has an 
abundance of visual inspection and machine “attending” jobs. 
Particularly in the latter type, failure to “attend” the machine adequately will 
result in material damage, and material damage can scarcely be effected in 
any other way. This accounts for the loadings (see Table 5) in “mental 
or visual demand” and “responsibility for material.” 

Item Significance. Tables 2, 3, and 4 each carry “total points” as 
the first item. It will be noted that the communalities (h²) are .995, 





Fig. 1. The relative proportions which each of the factors contributes to the total point 
ratings of jobs in each of three industrial plants. 


1.010, and .968, respectively, for Plants A, B, and C. In other words, the 
factors identified (two in Plants A and B and four in Plant C) account for 
practically all of the variability in the total point ratings. If it can be 
assumed that these are the best possible rotations, then the relative 
proportion that each factor contributes to the total may be obtained from 
the squares of the loadings listed after total points. These proportions 
are presented graphically in Figure 1. In Plant A, 90% of the variability 
is accounted for by Factor I, “Skill Demands,” and 10% by IIA, “Job 
Characteristics.” In Plant B, 99% is accounted for by Factor I and 1% 
by Factor IIBC, “Job Characteristics—Non-Hazardous.” In Plant C, the 
proportions are Factor I, 77.5%; Factor IIBC, 0.5%; Factor III, “Job 
Characteristics—Hazardous,” 19%; and Factor IV, “Attention 
Demands,” 3%. These variations in the relative extent to which the various 
factors contribute to the total and consequently to the determination of 
the wage structure itself exist in spite of the fact that the various items 
appear to be weighted the same in all plants. 
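
The proportions quoted for Plants A and B can be checked from the rotated “total points” loadings in Tables 2 and 3. The sketch below squares each loading and divides by the communality. Whether the authors normalized by the communality or used the raw squares is not stated, so the normalization is an assumption, as is the function name. The two choices agree closely here because the communalities are near unity. Plant C is omitted because its loadings (Table 4) are not reproduced in this section.

```python
def factor_shares(total_point_loadings):
    """Share of the 'total points' communality attributable to each factor."""
    squares = [k * k for k in total_point_loadings]
    h2 = sum(squares)  # communality of the total points item
    return [s / h2 for s in squares]


# Rotated loadings of "total points" from Tables 2 and 3.
plant_a = factor_shares([0.944, 0.322])  # Factor I, Factor IIA
plant_b = factor_shares([0.999, 0.107])  # Factor I, Factor IIBC

# Percentages rounded to whole numbers for comparison with the text.
plant_a_pct = [round(100 * share) for share in plant_a]
plant_b_pct = [round(100 * share) for share in plant_b]
```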


Summary and Conclusions 


Job rating data from three different plants were subjected to 
Thurstone’s factor analysis technique following the computation of 
intercorrelations of points awarded on each of the items. Rotation was 
accomplished by means of Peters and VanVoorhis’ procedure. The following 
conclusions are supported: 

1. A common factor, named “Skill Demands” and representing 
attributes or characteristics possessed by the employee, is present in all 
three plants and accounts for most of the variance in “total point” ratings. 

2. A second factor, named “Job Characteristics,” and representing 
aspects of the job itself with which the employee must contend, accounts 
for the remaining variability in one of the plants. 

3. In one of the plants, the munitions plant, the items composing the 
“Job Characteristics” factor just mentioned separate to make two 
factors, “Job Characteristics—Hazardous” and “Job Characteristics—Non-
Hazardous.” These two together with “Skill Demands” and another 
factor named “Attention Demands” account for all of the variability in 
total points in this plant. 

4. In the third plant, “Skill Demands” plus “Job Characteristics— 
Non-Hazardous” account for all of the variability. 

5. While there is considerable agreement from plant to plant insofar 
as the presence of factors is concerned, there is variation in the extent to 
which they contribute to “total point” ratings and, consequently, to the 
existing wage structure. “Skill Demands,” for example, varies from 
77.5% in one plant to 99% in another. This is significant 
since all three plants use point rating scales in which the point allowances 
for the various items are relatively the same. It is clear that the extent 
to which each item or factor contributes to the total cannot be determined 
by inspection of the scale alone and that the end result may differ 
from that intended by the makers of the scale. 

6. There is evidence that a point rating system should take into ac- 
count the inherent nature of the plant in order to achieve the optimum 
weightings of the primary factors. 


Received March 6, 1944. 

The Development and Use of Apparatus Tests in Industry 


Joseph E. Zerga 
War Manpower Commission, Division of Manpower Utilization, Los Angeles, California 


This article reviews recent industrial aptitude tests which utilize 
apparatus. Although tests that can be administered to a group are being 
used as far as possible by industry, some situations require a more compli- 
cated and individual procedure. Many such tests are described by Gar- 
rett and Schneck (17) and Bingham (7). Some more recent and extensive 
projects of various industrial organizations are described below. It is 
assumed that the reader is cognizant of the importance of validating such 
tests against a criterion. 

The Woodward Governor Company (27) had the Industrial Division 
of the Psychological Corporation of New York make a study of its plant 
jobs with the purpose of setting up testing devices to aid in the selection 
of employees. Special testing machines were designed and built for a 
number of jobs, and eventually job tests were developed to measure the 
following abilities: blue print reading, hand dexterity, machine skill, 
mathematical ability, mechanical skill, mental alertness, observation, 
technical judgment, technical ability, trade information, mechanical drawing, 
measurement ability, tool dexterity, personality, and interests. The 
success of these tests is indicated by Martin’s (27) statement that “as a 
measure for weeding out the untrainables our aptitude tests have worked 
out just about 85 per cent.” 

According to Oleen (31) of the Eagle Pencil Company of New York, 
the hiring of applicants without an adequate evaluation of their abilities 
results in 50 per cent of the employees developing into below average 
workers. The Eagle Pencil Company uses three performance tests for 
inspector applicants. The first test, primarily a measure of manipulative 
ability, requires the applicant to sort 150 colored metal tubes into six 
compartments, according to color. Included among the tubes are a num- 
ber of imperfects containing drilled holes which the applicant must place 
in a seventh separate compartment. The second test, primarily a meas- 
ure of perceptive ability, consists of 100 small aluminum spirals, 50 of 
which have a small hole correctly punched 3% turns from the end. The 
remainder have holes punched at varying distances from the standard 
point. The test consists in sorting the defective spirals from the perfect 
ones. A time and error record is kept for each subject. The third test, a 
paper and pencil test designed to measure ability to judge small distances, 
requires the subject to check the off-center dots in 210 circles, there being 
one dot in each of the circles. 

Drake (12) describes a performance test that was devised for the oc- 
cupation Foot Press Operator in the textile industry, a job consisting of 
the following basic requirements: 1. The replacing of six different rolls of 
material in the machine as they became exhausted; 2. The ability to 
quickly detect and reduce jams in the machine; 3. The ability to antici- 
pate machine jams by detection of material defects; and, 4. The ability 
to start and stop the machine whenever necessary. 

The test apparatus, two by three feet in area, consisted of two flags, 
one red and one white, moving at different speeds around the outer edge 
of the apparatus. In some eleven turns, or a period just under three 
minutes, the red flag overtook the white flag. Located at six points 
around the course were contact areas two inches in length and identified 
by white strips. Both flags being on the same, or different, areas at the 
same time would cause a light to flash in the center of the apparatus 
board. Since there were six switches, each being connected to a contact 
area, the subject could prevent the light flashing on by pulling the switch 
that corresponded to the position of the red flag. Failure to pull the 
switch, pulling the switch too late, or pulling the wrong switch resulted in 
the flashing of the light and the scoring of an error. Inasmuch as the 
light would flash nineteen times per cycle if no switches were pulled, a 
perfect score would constitute nineteen correct pulls on the appropriate 
switches. All excessive pulls were scored as errors. Following a demon- 
stration and explanation of the apparatus each subject was given one or 
two practice trials, followed by two trials for which scores were recorded. 
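
The scoring rule just described can be sketched as follows. The sketch assumes that each flash of the light counts as one error (a missed, late, or wrong pull), that each excessive pull counts as one further error, and that the number of correct pulls is nineteen less the number of flashes. How a wrong-switch pull was tallied when it also counted as an excessive pull is not stated in the description, so the function and its parameter names are illustrative only.

```python
FLASHES_PER_CYCLE = 19  # flashes per cycle if no switch is pulled


def score_trial(flashes, excess_pulls):
    """Score one trial from the number of flashes and of excessive pulls."""
    if not 0 <= flashes <= FLASHES_PER_CYCLE:
        raise ValueError("flashes must lie between 0 and 19")
    if excess_pulls < 0:
        raise ValueError("excess_pulls must be non-negative")
    correct = FLASHES_PER_CYCLE - flashes  # flashes prevented by a correct pull
    errors = flashes + excess_pulls        # assumed: one error for each
    return {"correct": correct, "errors": errors}


perfect = score_trial(flashes=0, excess_pulls=0)
typical = score_trial(flashes=3, excess_pulls=2)
```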

A comparatively new trend in test development has been the increas- 
ing emphasis that is being placed upon the use of job analysis data as a 
starting point. Drake and Oleen (14), for example, advocate the use of 
job analysis techniques as the initial step in test construction, the analysis 
to take into consideration the following factors: “1. Length of the cycle; 
2. Nature of the elements of the cycle; 3. Sizes of materials or parts; 4. 
Serial order of elements of the cycle; 5. Three-dimensional positions of 
parts manipulated; 6. Incidence of finger, wrist, arm, and body 
movements; 7. Posture of the operator; 8. Visual, tactual, and kinesthetic 
attentive factors; and, 9. Speed and rhythm of work.” Drake’s recent 
publication (13) describes the design and industrial applications of special 
performance tests based upon job analysis and time and motion study data. 
Described in detail are tests designed to measure dual operation of the 
hands, hand-foot coordination, rhythm and speed, perceptual ability, etc. 

Three reasons for the relatively low validity of employment tests are 
offered by Taylor (39) of the Hawthorne Works, Western Electric 
Company, Chicago, Illinois: 1. Environmental factors which may influence the 
subject’s future behavior are intangible and unmeasurable; 2. Inferences 
must be drawn from the subject’s performance on tests as to his success on 
a given job; and, 3. A test or battery of tests measures only relatively 
few of the multitude of factors comprising an individual. 

In conclusion, it may be stated that the complexity of a test of me- 
chanical ability or aptitude will, in general, increase in proportion to the 
complexity or number of the job elements which it is being designed to 
measure. For example, the basic skill elements required for successful 
performance on an unskilled repetitive job may be measured by such sim- 
ple tests as sorting, tapping, manipulating pegs, etc., whereas, the basic 
skill elements required for successful performance on a highly skilled non- 
repetitive job may have to be measured by a battery of tests, or a single 
complex test that has been specially designed for the job. 

The appended references will be of value to those who are concerned 
in following the development and use of apparatus tests in industry. 


Received July 5, 1943. 

The Effect of Wearing Glasses upon Judgments of Personality 
Traits of Persons Seen Briefly 


G. R. Thornton 
Purdue University 


In an earlier experiment¹ it was demonstrated that judgments of 
personality traits from photographs are influenced by variation of a single 
factor in a photograph. Persons photographed wearing glasses were 
usually rated significantly higher in intelligence, dependability, 
industriousness, and honesty than were the same persons when photographed 
without glasses. The question arises as to whether similar results would 
be found in the judgment of individuals appearing before the judges in 
person. The experiment reported in this paper yields a partial answer to 
this question. 


Procedure 


The plan of the experiment was simple. Eleven subjects, 7 men and 
4 women, mostly graduate students, were fitted with suitable rimless 
glasses, made of plain clear glass.² The students of two classes in 
beginning psychology, meeting on different days, acted as judges in the 
experiment. In each class the judges were divided randomly into two 
sub-groups which were then separated. In the first class all eleven subjects 
appeared first before one sub-group and then before the other sub-group; 
5 subjects wore glasses during their appearance before the first sub-group 
and did not wear them during their appearance before the second 
sub-group; the other 6 subjects appeared first without glasses and then with 
glasses. In the second class 10 of the subjects were used, 5 appearing first 
with glasses and 5 appearing first without glasses. 

The judges in each sub-group of both classes were asked to rate the 
subjects on six personality traits: kindliness, intelligence, industriousness, 
honesty in money matters, dependability, and sense of humor. Ratings 
were made by means of an eleven-point scale on which 0 meant 
“completely lacking in the trait,” 5 meant “having as much of the trait as the 
average person,” and 10 meant “having the greatest possible degree of the 

¹ Thornton, G. R. The effect upon judgments of personality traits of varying a 
single factor in a photograph. J. soc. Psychol., 1943, 18, 127-148. 

² These special glasses were furnished by Bausch & Lomb Optical Co. through the 
good offices of Dr. S. E. Wirt. I am indebted to Dr. Wirt also for fitting the glasses and 
for helpful suggestions. 




trait.” The judges were told nothing concerning the real purpose of the 
experiment, but only that it was an attempt to discover how accurately 
people can judge traits from visual cues. 

The subjects were presented to the judges one at a time. Each sub- 
ject walked into the classroom, seated himself before the group facing the 
judges, and remained there for about two minutes while the judges re- 
corded their ratings. Beyond the directions to follow this simple routine, 
the subjects were given no further instructions concerning their behavior. 

The subjects were unknown to most of the judges, but the judges were 
asked to indicate those subjects with whom they had “had any previous 
acquaintance.” The ratings by any judge on any subject who was recog- 
nized were eliminated before the tabulation of results. The average num- 
bers of judges from whom the ratings were usable for the various subjects 
were: 37 in the first class; 29 in the second class; and 63 in both classes 
combined (including one subject who because of illness appeared before 
only one class). These figures apply to each condition of wearing glasses 
or not wearing glasses; that is, in the first class 37 persons judged the 
subjects when wearing glasses and 37 other persons judged the same sub- 
jects when not wearing glasses. 

In both classes, following the judgments of the subjects who appeared 
in person, the judges were asked to make similar judgments of eight 
persons shown by means of lantern slides made from photographs. These 
slides were the same ones used in the earlier experiment referred to above. 
The slides shown to both sub-groups of each class were of the same persons 
and were as nearly identical as possible except for the fact that each person 
had been photographed once with glasses and once without glasses. Thus, 
a given subject who was shown to one sub-group wearing glasses was 
shown to the other sub-group without glasses. 


Results 


Since the behavior of the individual subjects, and in some instances 
their dress, varied in their successive appearances before the various 
groups of judges, it would be unreasonable to expect the effect of wearing 
glasses to show up clearly in the comparison of one individual with him- 
self. Results are presented, therefore, only for groups of subjects. 
These results are summarized in Table 1. 

The results of the judgments of the photographic slides are of only 
incidental interest. In general, these results support the conclusions 
arrived at in the earlier more extensive study, although the differences 
between the means for wearing glasses and not wearing glasses are smaller 
than the corresponding differences in the earlier study. This decrease in 
the magnitude of the differences probably results principally from two 
facts: (1) Fewer judges were used in the present experiment, making the 
average ratings somewhat less reliable. (2) The judgments of slides came 
after judgments of actual persons in the present experiment. 

Table 1 

Differences in Scale Points in Mean Ratings on Six Traits Given to Persons or 
Photographs of Persons When Presented With and Without Glasses, and 
t Ratios of These Differences 

Column headings (each with Diff. and t): Intelligence, Industriousness, 
Honesty, Dependability 

Groups of Subjects: 
   Photographic Slides 
   Persons Appearing Before First Class 
   Persons Appearing Before Second Class 
   Results Combined for Persons Before Both Classes 
   Results Treated Separately for Persons Before Two Classes 

[Cell entries of this table are not legible in the source.] 

¹ The differences are mean differences in ratings given to the same persons or 
photographs when shown with and without glasses. Positive differences indicate mean ratings 
higher for persons when wearing glasses; negative differences indicate mean ratings higher 
when not wearing glasses. 

² All t ratios are based on standard errors of differences calculated directly from 
actual differences as described by Guilford, pp. 141-142 (Guilford, J. P. 
Fundamental statistics in psychology and education. New York: McGraw-Hill, 1942). In 
calculating the SE of the mean of differences the conservative formula involving N - 1 
was used; N in each case was of course the number of subjects as listed in the next to 
last column. 

It is of interest to note that, in general, the differences obtained for the 
slides are greater than the corresponding differences obtained for subjects 
appearing in person. There is a significant difference between these 
differences, at the 99 per cent level, only in the case of judgments for 
honesty. The t ratio of the difference for honesty is 3.2. The corresponding 
t ratios for intelligence, industriousness, and dependability are 1.1, 1.3, 
and 1.6. These t ratios are for comparisons of the results from slides with 
the combined results for persons appearing before both classes as given in 
the next to last row in Table 1. Although only one of these differences 
between differences is statistically significant, the data suggest the hypoth- 
esis that wearing or not wearing glasses will have decreasing effect upon 
judgments of personality traits as the number of other cues upon which 
judgments may be based increases. In line with this hypothesis is the 
fact that the judges were quite generally agreed that it was easier to judge 
real persons than photographic slides. When asked in an informal vote 
which were easier to judge, most judges voted for actual persons, and none 
voted for the slides. It should be noted, however, that both the subjects 
and the glasses were different in the two parts of the experiment. 

The results of principal interest are the differences in mean ratings obtained
for subjects appearing in person. These differences, with their t
ratios, are listed in Table 1, first, for the 11 subjects as judged by the two
sub-groups of the first class, second, for the 10 subjects as judged by the
sub-groups of the second class, and third, for the 11 subjects as judged by
the sub-groups of both classes combined. Finally, at the bottom of Table
1, are presented the differences and t ratios obtained when the 10 subjects
appearing before the second class were treated as though they were
additional subjects, making the total number of subjects 21. Although
this last treatment of the data is of doubtful validity, there is some justification
for it because the same subjects appearing on the
second day were somewhat different in behavior and dress from what they
were on the first day. Judging from the t ratios obtained in these four
different analyses of the data, one may conclude that significant group
differences have been demonstrated for judgments of intelligence and
industriousness and that differences for judgments of the other four traits
are of doubtful significance.³
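
The t ratios discussed above can be illustrated by a short computation. The sketch below assumes that the "conservative formula involving N - 1" cited from Guilford divides the standard deviation of the differences (computed over N) by the square root of N - 1. The function name and the eleven differences are hypothetical and are not data from this study.

```python
import math


def paired_t_ratio(differences):
    """t ratio of the mean of paired differences, with SE = SD / sqrt(N - 1)."""
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    mean_diff = sum(differences) / n
    # Standard deviation of the differences, computed with N in the denominator.
    sd = math.sqrt(sum((d - mean_diff) ** 2 for d in differences) / n)
    if sd == 0:
        raise ValueError("all differences are identical; t ratio is undefined")
    # Assumed "conservative" standard error of the mean: divide by sqrt(N - 1).
    se_mean = sd / math.sqrt(n - 1)
    return mean_diff / se_mean


# Hypothetical mean-rating differences (with glasses minus without), 11 subjects.
example_differences = [0.4, 0.1, 0.6, -0.2, 0.3, 0.5, 0.0, 0.2, 0.7, -0.1, 0.3]
t_ratio = paired_t_ratio(example_differences)
print(f"t = {t_ratio:.2f} with {len(example_differences) - 1} degrees of freedom")
```

Under this assumption the ratio equals the usual paired t with N - 1 degrees of freedom, since SD / sqrt(N - 1) with SD computed over N is the same quantity as the unbiased standard deviation divided by sqrt(N).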

The evidence in Table 1, therefore, may be interpreted as indicating 
that wearing glasses or not wearing glasses by subjects who are seen at a 
slight distance for about two minutes tends to affect the judgments made 
of their intelligence and industriousness, more favorable judgments being 
made for the subjects when wearing glasses. This statement applies, of 
course, only to average judgments for a group of subjects; individual dif- 
ferences might well be expected. The data are too limited to justify a 
statement concerning the effect of wearing glasses upon judgments of the 
other four traits rated in the experiment, except possibly in the case of 
honesty in money matters. Here the data suggest, contrary to the results 
obtained with photographs, that wearing or not wearing glasses by sub- 
jects appearing in person has little or no effect upon the average ratings of 
their honesty. 


3 Two of the t ratios for judgments on kindliness and one t ratio for judgments on
sense of humor are above the 95 per cent level of significance for the number of degrees of
freedom involved. Further experiments might yield significant differences for these
traits. This appears doubtful, however, because of the fact that the earlier experiment
referred to previously indicated that ratings on kindliness and sense of humor were
affected significantly by the smiling or unsmiling expression of the subject, and in this
experiment the subjects’ expressions were uncontrolled.

Summary


Groups of judges rated individuals presented one at a time in person 
and on photographic slides for six traits of personality. The subjects 
appeared before some groups wearing glasses and before other groups 
without glasses. The results obtained for subjects presented by means of 
slides indicate, in agreement with the findings of a more extensive earlier 
study, that wearing glasses tends to cause persons to be rated more intelli- 
gent, more industrious, more honest, and more dependable. The results 
obtained for subjects appearing in person indicate that wearing glasses 
tends to cause persons to be rated more intelligent and more industrious, 
but probably not more honest. The data suggest that wearing glasses has 
less effect upon judgments of individuals appearing in person than upon 
judgments of individuals presented by slides; a reliable difference in this 
latter respect, however, has been established only for one trait, honesty. 

All results presented apply only to groups of subjects. The conclu- 
sions are limited to the type of subjects used—young adults, the sorts of 
glasses worn—usually appropriate, and the type of judges employed— 
college students. 


Received May 11, 1948. 








I. Stroke-width, Form and Horizontal Spacing of Numerals as 
Determinants of the Threshold of Recognition * 


Curt Berger 
Cornell University 


The enormous growth of modern automotive traffic makes increasingly 
important the legibility of all enamel license plates of motor vehicles and 
traffic signs. The question of how to improve the legibility of traffic 
signs, with special emphasis on color and size, has been studied in all the 
important countries of the world, but with little or no general agreement 
as to which kind of symbols is optimally legible. 

If, under present conditions, a better legibility is required, generally 
the size of the numerals or the letters is increased. Sometimes, however, 
an increase of size is impossible, either because there is not enough space 
available or because it would appear ugly. Furthermore, it is not neces- 
sarily true that the legibility of symbols increases by increasing the area of 
a number or letter. Many big signs are less legible than others presenting 
smaller but differently constructed symbols. 

The practical interest in this problem has produced a number of
interesting contributions on the factors determining the legibility of symbols
on highways and in printed material for reading purposes. These investigations
are mostly concerned with the absolute size of the symbols
and general conditions of illumination, using various criteria for determining
their legibility.¹

K. Dunlap² analyzed 122 different license-plates with the purpose of
improving legibility and efficiency. The results of this investigation can 
briefly be summarized as follows: (a) A light background with dark num- 
bers gave best results; (b) Plates without borders seemed best; (c) Num- 
erals spaced farther apart gave highest legibility; (d) Plates in which the 
numbers did not exceed 25% of the total area of the plate were most 

* Report of experiments leading to a newly designed and patented system of 
automobile license plates. 

1 Luckiesh, M., and Moss, F. K. The science of seeing. New York: Van Nostrand
Co., 1937. Idem. Quick and certain seeing on streets and highways. Proc. Inst. Traff.
Eng., 1938, 55. Idem. The visibility and readability of printed matter. This Journal,
1939, 23, 645. Tinker, M. A. Effect of visual adaptation upon intensity of light preferred
for reading. Amer. J. Psychol., 1941, 54, 559-563.

2 Dunlap, K. Report Highway Res. Bd., Nat. Res. Council, Division Office, 1932,
App. E., p. 3, article 4.

efficient; (e) Numerals with slender stroke were more efficient; and (f)
The contributing factors in order of merit were: Height-width ratio of 
letters, legend-background ratio, stroke-width of numerals, spacing of 
numerals, width-stroke ratio of letters, wave-length difference of legend 
and background, and number of single items on plate. 

This investigation, as well as many other studies (a few of them 
mentioned above), suggests a wide practical interest in the problem. But 
it has also theoretical implications which deserve attention. It is usually 
difficult, if not impossible, to draw conclusions from experiments made 
under refined laboratory-conditions and to apply them to complex every- 
day situations which rarely are sufficiently constant to permit reproduci- 
ble results. Yet if a well-defined problem presents itself, it seems gratify- 
ing to try. 

From a previous analysis of factors involved in measurements of visual
acuity³ three basic functions of the human eye can be named, which, in
addition to various other factors, must play a decisive role for the recognizability
of numerals on highways under daylight as well as under night-time
conditions. These functions are the power of the eye to resolve
details (resolving power), the ability to discriminate brightness differences
(brightness discrimination), and the influence of form (form-visibility).

Experiments on the dependence of visual resolution upon distance
from the eye⁴ have shown that the minimal visual angle increases with
distance if black or white symbols are used with reflected light. This
minimal angle is independent, however, of the distance of the eye when
very small luminous squares or points are used on a dark background and
when the eye is adapted to a medium light-intensity.⁵ From these results
it can be concluded that, all other factors being constant, distances between
details of a numeral are less important for white and for black
symbols under daylight conditions and with reflected light than for self-luminous
symbols under ordinary night-conditions. In other words,
brightness discrimination is the main factor in determining optimal distance
between details of numerals, black on a white background, or vice
versa, with reflected light, while, with luminous numerals, the best results
should be obtained with an extremely slender stroke and a low intensity,⁶


3 Berger, C. The dependency of visual acuity on illumination and its relation to the
size and function of the retinal units. Amer. J. Psychol., 1941, 54, 336-352.

4 Idem. Untersuchungen zur Methodik von Bestimmungen der Unterschiedsempfindlichkeit
des emmetropen Auges. Skand. Arch. f. Physiol., 1935, 71, 173-199.

5 Idem. Weitere Untersuchungen über die Unterschiedsempfindlichkeit (Auflösungsvermögen)
des emmetropen Auges. Ibid., 1936, 74, 27-62.

6 Berger, C., and Buchthal, F. Der Einfluss von Beleuchtung und Ausdehnung des
gereizten Netzhautareales sowie vom Pupillendurchmesser auf das Auflösungsvermögen
des emmetropen Auges. Ibid., 1938, 78, 197-219.

which leave a maximal space between details; for black and for white
numerals, with reflected light, a greater stroke-width will give optimal
visibility.

It was also found in these investigations that small black squares, even
with optimal brightness, disappear very soon if the distance from the eye
is increased; whereas white or luminous squares of the same size on a black
background are still visible at a distance five times as great or more if their
intensity is strong enough. This would mean that black numerals would
require a wider stroke, white and luminous numerals a more slender stroke
for optimal recognizability.

The factor of form has a particular theoretical implication. It is one
of the fundamental assumptions of Gestalt theory⁷ that form is a typical
example of a factor in perception which has no relation to the mosaic
structure of the retina or an analogous characteristic of the central nervous
system. According to configurationists, form can be explained only by
psychophysiological processes in the central nervous system comprising
larger fields whose configurational characteristics determine particular
form-effects in perception. Helson and Fehrer⁸ have also tested some of
these assumptions, especially as to whether the circle or another figure
plays a predominant role. Using a great number of criteria, they found
that the triangle, not the circle, “appeared” most frequently, while, with
other criteria, other figures were best. All in all, their investigation
seems to show that form as such plays a relatively small role in perception.
In an investigation about form-visibility and the function of the fovea,
Berger and Buchthal⁹ found that there is a direct relationship between
retinal structure and form-visibility. The more complicated a form
covering a constant area, the larger has to be the retinal image for black
forms, as well as for luminous, in order to be recognized, that is, more
retinal units are required for their recognition.

For the form-influence upon the recognizability of numerals we thus
have three possibilities: (1) If configurational assumptions are to be
applied, the numeral “0” (which is most like the circle) should be best
recognized, and since each numeral represents a definite but widely differing
configuration, optimal form-characteristics should be different for
each symbol if these are determined by configuration; (2) If form plays
no great role in recognizing numerals, the legibility of a numeral of equal
area and stroke-width should be little or not at all affected by form-variations
of its details; (3) If the decisive form-factor in perception is
related to the functional and structural characteristics of the retina or
corresponding central fields, we can expect a direct influence of form,
which should be similar for all numerals, disregarding their particular
configuration.¹⁰

7 Koffka, K. Psychologie der optischen Wahrnehmung. Bethe’s Handb. norm. u.
pathol. Physiol., 1931, 12, 1215-1271.

8 Helson, H., and Fehrer, E. V. The role of form in perception. Amer. J. Psychol.,
1932, 44, 79-102.

9 Berger, C., and Buchthal, F. Formwahrnehmung und Funktion der Fovea.
Skand. Arch. f. Physiol., 1938, 79, 15-26.

It may be said in summary that the following factors determine the 
recognizability of symbols and numerals on the streets: (1) The absolute 
size of the area covered by the symbol; (2) The use of white symbols on a 
black background or of black symbols on a white; (3) The difference of 
wave-length (color) of the light between symbols and background; (4) The 
width of stroke of the symbols, i.e. the ratio of stroke-width to width of 
symbol in a constant area; (5) The form of the symbol; (6) Spacing 
between the symbols; (7) The border or immediate surroundings of the 
symbols; (8) The use of reflected light or of luminous symbols (under 
night-conditions) ; and (9) The number of symbols on the plate. 

The present investigation has been carried out to determine all of these 
factors for a given constant area, with the exception of the influence of 
wave-length, a study which has to be carried out separately. The experi- 
ments with black symbols have been limited to the finding of optimal 
stroke-width, whereas those with white symbols on a black background 
have been carried out entirely. Furthermore, a ‘‘standard” legibility has 
been chosen, to which all other data determining the legibility of the 
numerals have been adjusted. The experiments to be described lead to 
the construction of 9 numerals, white-on-a-black-background, which, on a 
chosen constant area, are optimally and equally visible. The construc- 
tion of these numerals converts the particular space selected into a mini- 
mal area on which the numerals are optimally visible. Special experi- 
ments are described for testing night-conditions. Finally our results will 
be compared to those of Dunlap to which reference has already been made. 


Procedure 


The resolving power of the eye, the sensitivity to differences of light-intensities
and the visibility of forms, are different functions of the human
eye. These different functions, under ordinary life conditions, influence
each other and depend in different ways upon such factors as light-intensities,
size of retinal image and adaptation. Therefore, these functions
cannot be distinctly differentiated all at once in experiments for the
improvement of the legibility of symbols used under everyday conditions.

10 Summaries of the vast literature concerning visual problems are contained in
Guillery, H. Sehschärfe. Bethe’s Handb. norm. u. pathol. Physiol., 1931, 12, 745-808;
Troland, L. T. An analysis of the literature concerning the dependency of visual functions
upon illumination intensity. Trans. Illum. Eng. Soc., 1931, 26, 2-107; Bartley,
S. H. Vision. New York: Van Nostrand Co., 1941, and Polyak, S. L. The retina.
Chicago: The Univ. of Chicago Press, 1941.

But experiments can be made in such a way that at first some factors
(e.g., the influence of forms) are neglected and the optimal stroke-width
of lines is determined. For symbols of different forms, different stroke-widths
for optimal legibility will be found, from which an average value can be
calculated.¹¹ This average stroke-width can then be taken as a basis for
other experiments, dealing with the improvement of the form of each
symbol. Such is the procedure described in this investigation.
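
The first step of this procedure can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function names are hypothetical, the thresholds are invented for the purpose, and the optimum for each numeral is simply taken as the tested stroke-width with the greatest recognition distance.

```python
def optimal_stroke_width(thresholds_by_width):
    """Return the stroke-width (mm) giving the greatest threshold distance (m)."""
    return max(thresholds_by_width, key=thresholds_by_width.get)


def average_optimal_stroke_width(curves):
    """Average the per-numeral optimal stroke-widths (mm)."""
    optima = [optimal_stroke_width(curve) for curve in curves.values()]
    return sum(optima) / len(optima)


# Hypothetical thresholds: numeral -> {stroke-width in mm: threshold in m}.
hypothetical_curves = {
    "8": {4: 34.0, 6: 36.0, 8: 35.0, 10: 33.0},
    "5": {4: 33.0, 6: 35.5, 8: 36.5, 10: 34.0},
    "2": {4: 34.5, 6: 37.0, 8: 36.0, 10: 33.5},
}

for numeral, curve in hypothetical_curves.items():
    print(numeral, "optimum at", optimal_stroke_width(curve), "mm")
print("average optimum:", round(average_optimal_stroke_width(hypothetical_curves), 2), "mm")
```

The average obtained in this way serves only as the common starting value; the later adjustment of each numeral to the standard legibility is a separate step.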

Finally, experiments have been made to adjust the legibility of all the
numbers and the spaces between them to each other, that is to say, to one
equal or “standard” legibility. The numbers now used over the entire
world as well as those found by the above described procedure differ in
their legibility from each other. Thus, if more than one number at a time
is used, one of the signs or numerals in a group may still be visible,
when their distance from the eye is increased, while the rest already have
diffused into an indistinguishable pattern. The ideal requirement for the
numerals used in practice must be that they all have equal legibility, and
therefore disappear at the same time, that is to say, at the same distance
from the eye.¹²

One solution of this problem is that one uses one number, for example
the number eight (because of its harmonic form), as a “norm-number” and
constructs the other numerals and the spaces between them in such a way
that they all appear and disappear at the same time, that is, at the same
distance from the eye as the “norm-number” eight. It is best, both from
the esthetic point of view and for space-economy, to use the same height
for all numbers, as well as the same width of stroke.¹³ The construction
of equally legible numerals on a minimum space can then be made by
adjusting the inner distances of the numbers until their legibility is equal
to that of the standard number eight. By inner distance the horizontal
distance between the inner borders of the vertical strokes of each number

11 Keeping the area of all symbols constant, each number will require a special 
stroke-width for optimal legibility. That would not mean much improvement, for each 
number would then be legible at different distances and its esthetic appearance would be 
odd. The final solution must be to use numbers of equal legibility, which, at the same 
time, gives rise to optimal legibility for the particular area investigated. 

12 It may be mentioned that numbers, no matter of what kind, will have a limiting
distance at which they disappear, which is different for different individuals. These
differences are due to individual differences of the visual functions and cannot be overcome.
But the numbers found in this investigation will have optimal average legibility
for all normal subjects.

13 If only one row of numerals is used, their length could be chosen differently, especially
for number 1. But number 1 is best legible anyhow, and the use of a frame or
second row makes it necessary to keep the height of all numerals equal.

is meant. The outer distances, which are the horizontal distances between
the outer borders of the vertical strokes of two adjacent numerals, should
be determined in the same way, namely experimentally as follows: The
inner distance of one numeral or the outer distance between two numerals
is varied in the range corresponding most nearly to the legibility of the
standard number eight, and the threshold of legibility is determined 8
times for each particular inner or outer distance. Before and after these
series of experiments the threshold of legibility of the standard number
eight is determined (this is repeated for each series because of possible
slight differences between each series), and from the results two curves
can be drawn. Their point of intersection shows the exact horizontal
inner and outer distances required for equal legibility of the numeral or
the distance between two numbers on the basis of the standard number
eight. Carrying this procedure through with each number, each outer
and inner distance, 9 numerals of equal legibility can be constructed.
Recapitulating, it can be said that the experiments described in this
treatise lead to the construction of 9 different numbers: 1, 2, 3, 4, 5, 6, 7, 8
and 0 (the 9 being equal to the 6 reversed), white on a black background
for day vision and luminous on a black background for night vision. All
the numbers have the same height of 80 mm, the same stroke-width (one
for day vision, another for night vision), different forms, different inner
and outer distances, adjusted in such a way that all numbers, single or in a
constellation, become recognizable at the same distance from the eye,
and also diffuse simultaneously at a distance which is the longest possible
distance for the particular area used.
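
The intersection of the two curves described above can be found numerically. The following sketch makes two assumptions that are not stated in the text: the standard-eight curve is treated as a horizontal line at the mean of its thresholds before and after the series, and the numeral's curve is interpolated linearly between tested distances. All names and figures are hypothetical.

```python
def equal_legibility_distance(distances_mm, thresholds_m,
                              standard_before_m, standard_after_m):
    """Inner (or outer) distance at which the numeral matches the standard eight.

    Returns None if the numeral's curve never crosses the standard level.
    """
    # Assumed standard level: mean of the eight's thresholds before and after.
    standard = (standard_before_m + standard_after_m) / 2.0
    points = list(zip(distances_mm, thresholds_m))
    for (d0, t0), (d1, t1) in zip(points, points[1:]):
        if t0 == standard:
            return float(d0)
        if (t0 - standard) * (t1 - standard) < 0:
            # Linear interpolation between the two bracketing measurements.
            return d0 + (standard - t0) * (d1 - d0) / (t1 - t0)
    if points and points[-1][1] == standard:
        return float(points[-1][0])
    return None


# Hypothetical series: inner distance (mm) against threshold of legibility (m).
inner_distances_mm = [20, 24, 28, 32]
thresholds_m = [31.0, 33.0, 35.0, 36.0]
match = equal_legibility_distance(inner_distances_mm, thresholds_m, 34.0, 34.0)
print("inner distance for equal legibility:", match, "mm")
```

The same function applies to outer distances between two adjacent numerals, since only the quantity on the horizontal axis changes.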


The area covered by the “standard-number” eight used for the experiments
described in this article was 42 mm × 80 mm height. This is the height most
frequently used in European countries and on the American continent. The
width of this area was reduced by 6 mm from the width used at present in
Denmark (48 mm). Such a reduction was advisable in consequence of some
preliminary experiments, which showed that the stroke width used in Denmark
(16 mm) is far too great.

The choice of this particular area for the standard-eight, although decisive
for the absolute thresholds of number legibility finally obtained, was taken only
in order to get some definite starting point for the investigation and also in
order to compare the new results with some definite numbers, already used.
The Danish numbers, before the war, had been constructed by a well known
Danish ophthalmologist and seemed therefore to deserve to be taken as a comparable
norm, since the well known principle of Snellen was used. (Compare
Fig. 13, Part II.)

The results presented in this article are restricted to the particular dimen- 
sions outlined above and cannot directly be converted into new dimensions for 
other areas, larger or smaller. For the construction of a whole set of numerals, 
regardless of size, width and height, the main experiments described in this 
article should be repeated with areas of different sizes in steps which allow for 
drawing curves and integrating over whatever area is used. 

Until such a complete investigation is made, one will not be much mistaken
if he calculates the dimensions of numbers, covering another absolute area,
proportionately to the dimensions found in the present investigation. If the
height is the same, but the width 10% greater, one might simply add 10% to all
the horizontal figures found and indicated in this study. If the width is the
same, but the height 10% more, simply add 10% to all vertical data. If the
width is 15% less, and the height 10% less, proceed in the same way; namely
subtract 15% from all horizontal values, and subtract 10% from all vertical data
found in this investigation.
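
This proportional rule can be written out as a small calculation. The sketch below is an illustration only; the function name and dictionary keys are hypothetical, and only the 42 mm width and 80 mm height of the standard area are taken from the text.

```python
def scale_numeral(horizontal_mm, vertical_mm, width_change_pct, height_change_pct):
    """Scale horizontal and vertical dimensions by separate percentage changes."""
    width_factor = 1.0 + width_change_pct / 100.0
    height_factor = 1.0 + height_change_pct / 100.0
    scaled_horizontal = {k: v * width_factor for k, v in horizontal_mm.items()}
    scaled_vertical = {k: v * height_factor for k, v in vertical_mm.items()}
    return scaled_horizontal, scaled_vertical


# Standard area from the text: 42 mm wide, 80 mm high.
horizontal = {"width": 42.0}
vertical = {"height": 80.0}

# Width 15% less and height 10% less, as in the last case described above.
new_horizontal, new_vertical = scale_numeral(horizontal, vertical, -15, -10)
print(round(new_horizontal["width"], 2), "mm wide,",
      round(new_vertical["height"], 2), "mm high")
```

Any further horizontal values, such as inner and outer distances, would be added to the first dictionary and scaled by the same width factor.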


Method 


To determine the legibility of a number the threshold was found by
increasing the distance between the test object and the eye of an observer;
if the legibility of a number was tested, the observer was required to watch
the number; if a distance between two numerals was checked, the subject
was asked to watch only that particular distance, disregarding the legibility
of adjacent numbers. The threshold of legibility or recognition
was determined by increasing or decreasing the distance between the
subject and test object until the particular characteristic observed could
not be recognized any more or could just be recognized again. The subject
stood upright on his feet upon a lawn of 100 m length. The numerals
for day vision were made of white paper and pasted on black¹⁴ cardboard
(or vice versa) on a wooden frame, and were moved slowly towards the
subject and away again. The numerals for night vision were cut out of
cardboard and pasted upon a sheet of opal-glass, illuminated from behind
by some bulbs (about 1 inch from the opal-glass) in a portable box with
battery. If the subject could just recognize the number or in other
experiments the space between two numbers or could just not recognize
them any longer, he raised his hand for a moment. At this point the
distances were checked each time with a ribbon of about 30 m length
lying on the lawn.

Number of experiments. The appearance and disappearance of a numeral or 
a distance between two numbers was always measured four times in direct 
sequence. Every point in the curves and every number in the tables mentioned in 
this article are therefore average values of 8 single experiments, unless otherwise 
indicated. The mean deviations for single series were between 5 and 12%, 
exceptionally up to 15%. 
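
As a worked illustration of how one such average value and its mean deviation might be computed, consider the sketch below. It assumes that the "mean deviation" is the average absolute deviation of the 8 single readings from their mean, expressed as a percentage of that mean; the function name and the readings are hypothetical.

```python
def threshold_and_mean_deviation(appearance_m, disappearance_m):
    """Return (mean threshold in m, mean deviation as a percentage of the mean)."""
    readings = list(appearance_m) + list(disappearance_m)
    if not readings:
        raise ValueError("no readings supplied")
    threshold = sum(readings) / len(readings)
    # Assumed definition: average absolute deviation from the mean.
    mean_deviation = sum(abs(r - threshold) for r in readings) / len(readings)
    return threshold, 100.0 * mean_deviation / threshold


# Hypothetical distances (m): four appearances and four disappearances.
appearance = [33.0, 35.0, 34.0, 36.0]
disappearance = [38.0, 37.0, 39.0, 36.0]
threshold, deviation_pct = threshold_and_mean_deviation(appearance, disappearance)
print(f"threshold {threshold:.1f} m, mean deviation {deviation_pct:.1f}%")
```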

Subjects. Some subjects were accustomed to psychophysiological experiments,
others were absolutely inexperienced. Thus differences have been found
with respect to the accuracy of the experiments (mean deviations), but there
have not been consistent differences of the forms of the curves or relations
investigated between different subjects. Generally four subjects were used, in
more important experiments more. The subjects were all emmetropic. The
absolute values of various subjects differed as in all biological experiments, but
in view of the good agreement in the form of curves and relationships investigated,
it seems doubtful whether the results would be greatly changed or improved
by using a very large number of observers.


14 Special attention was given to the choice of white paper or cardboard of highest
reflecting quality without glaring effect (dull) and black paper and cardboard of highest
possible absorption (dull).

Experiments to compare different conditions. It is impossible directly to
compare experiments made on different days under different conditions, because
there are threshold differences between one day and another for the same subjects
even under equal conditions. Furthermore, most experiments were made
with daylight and although the experiments were made only when the sun was
covered and at the same time in the morning, differences in brightness from day
to day could not be avoided completely.¹⁵ Experiments which were made to
compare the legibility under different conditions were therefore always made
on the same hour of the same day, and it was observed that the light did not
change noticeably during the experiments. Furthermore the use of the standard
number eight at the beginning and at the end of most series of experiments
made it possible to control such changes, if they could not be avoided entirely.

Night-experiments. The night experiments were not made in complete 
darkness, but under conditions corresponding to average street-light. On one 
side of the lawn at a distance of about 10-15 m from the subject were a few big 
lighted street-lamps. The light was not permitted to shine directly on the 
numerals or into the eyes of the subjects. 


Results: A. Daylight Experiments 


(1) Determination of optimal stroke-width for white and black numbers. 
The first experiments were made with two observers and three numbers: 
8, 5 and 2, which differ most, but are at the same time most harmonic. 
Strictly these experiments should have been made with all 9 numbers. 
But as can easily be understood, numbers with no divisional structure, as 
the number 1, or numbers with very little divisional structure as the 
numbers 0, 7 or 6 are not well fit to find a good average stroke-width. 
The best numbers for this purpose seemed to be those most uniformly 
structured, namely the three mentioned above. It must also be remem- 
bered, that these experiments are intended to find a good average stroke- 
width for all 9 numbers, because as mentioned above, what will be seem- 
ingly lost for the legibility of one particular number by choosing only one 
stroke-width for all numbers, will be regained later on by adjusting all 
numbers to one equal or “standard” legibility. Therefore, in spite of
the fact that strictly and scientifically speaking, no such thing as an 
“average optimal stroke-width” can in itself solve the problem of optimal 
legibility for all the numerals, this average stroke-width for the three 
numbers mentioned above is preparing the ground for the further adjust- 
ment of all numbers to standard visibility or legibility and therefore is 
practically justified by the subsequent procedure. 

It is doubtful whether similar experiments with more numbers would
lead to a noticeably different “average stroke-width,” unless the number
one is used too. But the number one, represented only by a vertical
stroke, has obviously to be considered as an exception, since it has no
differentiating structure and will improve its recognizability continuously
until a square of 80 mm × 80 mm is reached. Even afterwards, changing
into a rectangle of 8 cm height, it still would increase its threshold by
increasing its width, making this number a definite exception, for which
no special width experiment can be considered.

15 Compare with Harrison, W. and Luckiesh, M. Comfortable lighting. Illum.
Engng., 1941, 36, 1110-1111.

Fig. 1. The three numerals 8, 5 and 2, reduced to about 1/10th of the area. The
very slender strokes indicate the minimum, the wide strokes the maximum stroke-width
investigated. Six steps were used between these limits.

It can also be concluded from this consideration, and this is confirmed
by the subsequently described experiments, that the least structured
number, 1, is the most legible number, while the most structured numbers,
such as 4 and 8, are the least legible numbers for a given area and
stroke-width.

For a direct comparison between white and black numbers, both kinds
were used alternately in the same series of experiments, while the stroke-width
was increased from 2 mm to 16 mm with a constant outer area:
42 × 80 mm. Figure 1 shows the three numbers, reduced to about 1/10th
of the area, used during the experiments.

Table 1

Experiments about the influence of stroke-width upon the threshold legibility of white
and black numerals on an area of 42 mm × 80 mm, with daylight. Two observers
were used and three numerals: 8, 5 and 2. Each number represents an average
of 8 single experiments, the last column an average of 48 single experiments.
(Compare Figure 2)

In Figure 1 only the two extreme widths of stroke, 2 mm and 16 mm, 
are shown. The results are represented in curves of Figure 2 and in 
Table 1. 

Table 1 shows the thresholds of recognition for the numerals 8, 5, and
2, white on a black background and black on a white background, for two
subjects separately, and the average of all series with numerals of equal
stroke-width. The thresholds of recognition are the average of 4
determinations of the recognizability of each numeral and 4 determinations of
their illegibility, expressed in distances (m) from the eye of the Ss.
Figure 2 shows only the average of all series (3 numerals and 2 Ss =
6 series) with numerals of equal stroke-width. The optimal stroke-width
differs somewhat for different numbers and subjects, as expected. For
the white numbers the 8 tends in some cases to have a better threshold
with a more slender stroke-width than the average; the number 5 tends
towards a better legibility if a wider stroke than the average is used. For
the black numbers there is a slight tendency for number 2 with one
subject to be most visible or legible if a wider stroke is used than the average.

Fig. 2. Influence of stroke-width between 2 and 16 mm upon the recognizability of
3 numerals (8, 5 and 2) on the constant area: 42 mm X 80 mm. 2 observers. Curve A
= white numerals on a black background, Curve B = black numerals on a white
background. (Compare Table 1.)
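The averaging behind each tabulated threshold can be sketched in a few lines of Python. This is only an illustration of the procedure described above: the function name and the distances are hypothetical and are not taken from Table 1.

```python
from statistics import mean


def recognition_threshold(appear_m, disappear_m):
    """Average the distances (m) at which a numeral became recognizable
    and the distances at which it became illegible."""
    return mean(list(appear_m) + list(disappear_m))


# Hypothetical distances in metres for one numeral and one observer.
appear = [35.0, 36.0, 35.5, 36.5]      # 4 determinations of recognizability
disappear = [37.0, 38.0, 37.5, 36.5]   # 4 determinations of illegibility
single = recognition_threshold(appear, disappear)

# 3 numerals x 2 observers x 8 determinations feed the overall average.
determinations_in_average = 3 * 2 * 8
```

With these made-up values the eight determinations average to 36.5 m, and the count of 48 agrees with the "48 single experiments" named in the caption of Table 1.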

The average values of all numbers and subjects show the following
result: The optimal average stroke-width for white numbers on a black
background with reflected daylight on an area 42 mm X 80 mm is 6 mm, or in
other words is in the proportion of 1:5 for the stroke-width and the 
horizontal distance between the inner borders of the vertical boundaries. 

The optimal average stroke-width for black numbers on a white background 
is 10 mm on an area 42 mm X 80 mm, or a proportion as mentioned above 
of 1 : 2.2, with reflected daylight. 
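The two proportions quoted above follow from the plate width and the stroke-width alone. The sketch below assumes that the inner distance is the 42 mm width less two vertical strokes; the function name is illustrative only.

```python
def stroke_to_inner_ratio(stroke_mm: float, plate_width_mm: float = 42.0) -> float:
    """Return the inner horizontal distance divided by the stroke-width."""
    # Assumption: two vertical strokes bound the inner distance.
    inner_mm = plate_width_mm - 2 * stroke_mm
    return inner_mm / stroke_mm


white = stroke_to_inner_ratio(6.0)   # 30 mm inner distance, 6 mm stroke
black = stroke_to_inner_ratio(10.0)  # 22 mm inner distance, 10 mm stroke
print(f"white 1:{white:g}, black 1:{black:g}")
```

Under this assumption the white numerals give 1:5 and the black numerals 1:2.2, in agreement with the text.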

The average optimal recognizability of white numbers is at 36.5 m for
the area mentioned, while the optimal average threshold of legibility for
black numbers of the same area is at 33.5 m. It can therefore be
concluded: White single numerals with optimal stroke-width (6 mm) can, on
the average, be recognized 8.8% better than black numerals on a white
background with optimal stroke-width (10 mm), if the same area of 42 mm
X 80 mm is used. This conclusion should not be applied to groups
consisting of more than one single number, because then the distances
between the numbers must be considered separately.

Fig. 3. Influence of stroke-width upon recognizability of white numerals on a black
background. The curve represents the average of 2 observers and 3 numerals (8, 5 and
2), and is a repetition of the experiments represented in Figure 2, Curve A, but in greater
detail.


(2) Repetition of experiments for finding optimal stroke-width for white 
numerals only. The first series of experiments was made not only to find 
the optimal stroke-width of the numerals, but also to compare recogniz- 
ability of white and black numbers. To check upon these results, and 
possibly obtain a more exact result, white numbers only were used in new 
series, in which the stroke-width was changed from 1 to 17 mm in one mm 
steps, using the numbers 8, 5 and 2 as in previous experiments (compare 
Figure 1). The results, represented in Figure 3, which was found by the 
same procedure as Figure 2, show again an average maximum of 
recognizability at a stroke-width of 6 mm, while in spite of a deviation at 
9 mm, the curve is very much like the previous curve for white numbers.
This shows that in spite of the somewhat unstable daylight conditions the
tendency of the results with repetition series is very much the same, 
although the absolute distances for threshold legibility are somewhat less. 

(3) Determination of dimensions and form for “standard.” One of the
main difficulties in improving the recognizability of symbols is the problem 
of defining just what area and what number should be chosen as a basis. 
This also seems to have been one reason why results have been taken for 
correct which are directly contradictory. Thus, Germany has considered 
black symbols on a white background as more legible while England, 
Denmark, Argentina and others use white numerals on a dark background. 
By taking ready-made numbers, as Dunlap did, certain vague conclusions 
can be obtained, but they are determined by the particular material used, 
unless the variables which determine the legibility are changed systemati- 
cally one after another, keeping the rest of the variables constant while 
one is changed. Thus it can be shown from Table 1 and Figure 2, that if 
black and white numbers are compared which have a width of stroke of 
14 mm on an area 42 mm X 80 mm, the black numbers will have a 
greater average recognizability than the white, while the opposite is true 
for the same area, if the stroke-width is taken as 6 mm for both combina- 
tions. The results will be still more confused, if only one particular 
number is chosen or five number groups. Then the contradictions will 
mount according to the form differences of the number chosen or according 
to their form and distance apart. These have been some of the reasons
why the task of finding “optimally recognizable numbers” seemed
unsolvable.

Our results show that the problem can be solved by systematic
investigations. The averaging of results obtained with three different numbers
and two different observers, disregarding the particular form effect of each 
number, seems at first very crude and, strictly speaking, unscientific. 
One is likely to believe that such a procedure can only be considered as a 
theoretical abstraction, while the actual numbers still will show unavoid- 
able threshold differences. This would be true, if the investigation was 
interrupted and given up at this point; such a criticism, however, will be 
shown faulty or at least exaggerated, by the following experiments and 
adjustments. 

The main difficulty to overcome at this point of the investigation is to
create a valuable “standard” number, to which the recognizability of all
other factors can be related. The preliminary step for this choice was the
finding of a suitable width of stroke, which has been determined as 6 mm
for white numerals on the area 42 mm X 80 mm. In this connection it
might be well to realize that the crudeness of this determination is by no
means detrimental for the following procedure. It is probably without
major importance, whether 6, 7 or 8 mm would have been chosen. The
differences in recognizability in this range are practically negligible. But 
it was important to find out, whether a very slender, a very wide, or a 
medium stroke-width should be chosen, in order to obtain eventually 
optimal recognizability for all numbers. Therefore, in spite of the fact 
that this particular width of 6 mm for the area 42 mm X 80 mm is giving 
best results only in an average, disregarding form-influences to be investi- 
gated later or form-factors due to the particular appearance of the num- 
bers, it is just this average that we are aiming at, since we want to find 
numbers of equal and optimal recognizability. This can best be obtained 
by adjusting all the numbers to exactly this average stroke-width. 

If the numbers were now all constructed with this stroke-width and 
into the same area 42 mm X 80 mm, they would all be visible at different 
threshold distances. In order to be able to adjust them to each other, the 
number 8, as the most harmonic number, was chosen as the “standard”
number. By “harmonic” is meant that equality arrived at in each half 
when the figure 8 is cut in half either vertically or horizontally. Similar 
harmony is found only in the numbers 1 and 0. The number 1 is unfit as 
a standard, due to the circumstances described above, and the number 0 
has no inner structure as most other numbers have, and is therefore also a 
kind of exception, which would be less appropriate for serving as standard. 
The most promising conditions, therefore, are offered by number 8. 

Another question to be mentioned in this connection, is the following: 
If one number is chosen as standard to which the recognizability of all 
other numbers is adjusted, are we not cutting down unreasonably the 
recognizability of those numerals, which constructed equally would be 
more legible than the number 8? Should we not rather take the best 
recognizable number as standard and adjust the legibility of the other 
numbers to this optimal legibility? This consideration is based upon an 
illusion. It is practically meaningless, whether we adjust the recogniz- 
ability of all numbers to the most or the least recognizable numeral. In 
the first case we would have to increase the lateral width of practically all 
numbers, while in the other case we would have to decrease it. Such a 
loss or gain of area would only be of importance if the absolute area of 
42 mm X 80 mm were the only available area. Since the opposite is 
correct, namely that this particular area is only one of an indefinite num- 
ber of different areas available, our approach to the solution of the 
problem of equally and optimally recognizable numbers on a particular 
area chosen should be confined to the best average conditions for this area. 
If different recognizability is required, the absolute area can be changed, 
and on the basis of the above mentioned complete series of experiments 
optimally recognizable numerals for this area will be chosen. If we were to
try to adjust the recognizability of all numbers to the most legible numeral,
we would risk that the average stroke-width found would be insufficient
for the adjustment of certain numerals. Therefore, the number eight and
the stroke-width of 6 mm are a good “standard.” Since it gains special
importance through its choice as a standard, the number eight has been 
investigated very carefully. 

First, the experiments with strokes of different width have been
repeated with the number 8 alone and with three subjects. The results are
shown in Figure 4, which was computed from a corresponding table in
the same way as Figures 2 and 3, confirming 6 mm once more as optimal
stroke-width for the area 42 mm X 80 mm.

Fig. 4. Influence of stroke-width upon the recognizability of the numeral 8, white on
a black background. Average of results with 3 observers.

Second, the form of the number eight was investigated. The main
changes which can be made in the form of number eight are the
curvatures of the lines, bending towards the center. Ten different steps of
these curvatures were investigated with three subjects, curvature 1 having
the strongest bending of lines towards the center. This bending
decreased stepwise towards No. 10, which represents a number eight
practically consisting of two rectangles. From the results, Figure 5 was
constructed, which shows that curvatures 1 and 2 are of optimal
recognizability. These are the curvatures with the strongest bending towards the
center. The fact that 1 and 2 are equal and similar to 4 indicates also that
a further increase of the curvatures would not increase the legibility of
numeral eight appreciably. It is interesting to note that the angle which
is formed by the two lines crossing in the center of the optimally
recognizable eight is close to 90 degrees. The lines thus cross each other, forming
four angles of about 90 degrees each around the center of the numeral
eight. If any other inclination were chosen, two of these angles would be
less than 90 degrees, and thus their recognizability diminished. This, as
well as later form-experiments, indicates that besides width of stroke and
ratio of stroke-width to width and height of number, the angle at which
connecting lines run from one horizontal or vertical line to another is of
importance for the recognizability of the numerals.

Fig. 5. Influence of curvatures of lines, bending towards the center (1 = strongest
bending) upon recognizability of numeral eight. Average of 3 observers.

Third, an esthetic and practical adjustment of the horizontal cross-cut 
dimensions of the standard number eight to its vertical cross-cut dimen- 
sions was made. If the area on which the number eight is constructed is 
42 X 80 mm, and the number 8 has a stroke-width of 6 mm, the vertical 
cross-cut will be: 6 mm width of upper horizontal line; 31 mm distance 
between upper and middle-line; 6 mm width of middle-line; 31 mm dis- 
tance between middle-line and base; and 6 mm width of base-line, but the 
same number will have the following horizontal proportions: 6 mm—30 
mm—6 mm. Since the actual numerals used in most countries of the 
world have the height of 8 cm, it seemed best to keep that height but to 
choose the inner distance of the horizontal cross-cut as 31 mm instead of 
30. This adjustment makes the standard-number eight perfectly har- 
monic. The proportions of the standard number eight are consequently: 


Vertical cross-cut: 6 mm, 31 mm, 6 mm, 31 mm, 6 mm
Horizontal cross-cut: 6 mm, 31 mm, 6 mm
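The proportions of the standard eight can be checked with simple arithmetic. The sketch below merely sums the segments named in the text; the variable names are illustrative. The resulting overall width of 43 mm is an inference from the adjusted inner distance of 31 mm, not a figure stated in this paragraph.

```python
STROKE_MM = 6   # stroke-width of the standard eight
INNER_MM = 31   # adjusted inner distance, both vertically and horizontally

# Top line, upper opening, middle line, lower opening, base line.
vertical = [STROKE_MM, INNER_MM, STROKE_MM, INNER_MM, STROKE_MM]
# Left stroke, inner distance, right stroke.
horizontal = [STROKE_MM, INNER_MM, STROKE_MM]

height_mm = sum(vertical)
width_mm = sum(horizontal)
print(height_mm, width_mm)
```

The vertical segments sum to the 80 mm height, while the horizontal segments sum to 43 mm, one millimetre more than the 42 mm of the unadjusted form.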


(4) Form-experiments. Having determined the optimal average width 
of stroke for white numerals of an area 42 mm X 80 mm, and having 
determined the optimally recognizable standard-number eight, experi- 
ments were conducted to find the thresholds of optimal form-recognition 
for all those numerals whose form is varied in practice. 

(a) Number 1 and number 0 have a form, which cannot be varied. 

(b) The number 2 has mainly two characteristics of form influencing
its recognizability: the length of the line in the upper left corner and the
bending of the line connecting the upper right corner with the left end of
its base. Both factors were investigated with 4 observers. It was found
that the number 2 is best recognizable if the line in the left upper corner is
short, between 10 and 13 mm long, and if the line connecting
the upper right corner with the left corner of its base is a straight line.
The results of this latter series of experiments are shown in Figure 6.
The inclination 1 represents the straight line connection between the
upper right corner and the left corner of the base of number 2, while the
following numbers represent inclinations of this connecting line, moving
the lower left point stepwise towards the middle-point between upper and
lower horizontal. Again it is interesting to note that if a parallel is
drawn to the base through that point at which the bending of the
connecting line begins, the angles at this parallel as well as at the base are nearly
equal, in this case about 45 degrees.

Fig. 6. Influence of inclination of the line touching the base of numeral 2 upon
its recognizability. 1 indicates strongest bending of about 40 degrees. Average of
4 observers.

(c) The number 3 has first been investigated, using an upper half 
equal to the lower half. The curvature had little influence, but again the 
best recognizability was obtained with the strongest bent curvature. 
This seemed to indicate that the use of straight lines for the upper half of 
number three, which would bend in greater angles to each other than any 
curved line, might improve the recognizability of numeral 3. This was 
found correct. Changing the center point from the left to the right 
vertical boundary in a new series of experiments, it was found that the 
best recognizability was obtained, when the middle point for connecting 
lines lay exactly in the center. Apparently in this case, the angle of the
connecting line alone was not decisive; the distance between the
connected middle-point and the upper and lower left boundary-points was
also important. Since the distance between base and upper line is less in
the left boundary of number three than in its center, the central position
of this connecting point is more favorable for the recognition of number 3.

(d) The number 4 has been investigated with a number of different 
viewpoints in mind. One was the distance between base and middle-line. 
Another was the right end of the horizontal middle-line, extending over 
the vertical line. Their best proportions found, the question was ex- 
amined whether number 4 would be more recognizable if, instead of con- 
necting the left corner of the horizontal middle-line with the upper end 
of the vertical line, this connecting line would be made as a vertical 
parallel to the vertical center line. This as well as all the intermediary 
steps was investigated, and it was found, that, corresponding to the 
already mentioned importance of the angles of connecting lines, the great- 
est angle was most favorable for the recognizability of number 4, namely a 
direct linear connection between the left corner of the horizontal middle- 
line and the upper end of the vertical line. 

(e) Number 5 was only investigated as to the length of its upper hori- 
zontal line. The greatest length of this line, covering the total horizontal 
cross-cut of 42 mm, was found best. 

(f) The numbers 6 and 9, being of equal form, were investigated in two 
respects. One could imagine that making the half which is covered by a 
rounded square, larger or smaller, the recognizability might be improved. 
This is not the case. If the rectangle covers exactly half of the number 6
or 9, its recognizability is best. Again the line connecting this half
with the upper boundary (6) or with its base (9), could be made vertical 
or with any bending angle between 0 and 45 degrees. This was investi- 
gated in steps and the greatest angle of 45 degrees was found most favor- 
able, confirming again the importance of the angle of connecting lines for 
the recognizability of the forms of numerals. Again the angle at the 
base and at the center thus is made equal, in this case about 40 degrees, or 
figuring from the center of the line 45 degrees. 

(g) Number 7 was investigated by changing the position of the line 
connecting the upper horizontal line with the imaginary base from a verti- 
cal position to the greatest possible angle. In agreement with the above 
described form experiments of other numerals it was found that the great- 
est angle was best, making again the angles both at the horizontal upper 
line and the base (imagined as a parallel horizontal) equal, in this case 
about 27 degrees. 

(5) Experiments to adjust the recognizability of all single numerals to the 
recognizability of standard number 8. As mentioned above, the numerals 
obtained at the present point of the investigation are discernible at dif- 
ferent limiting distances from the eye. Having decided to keep their 
height as well as their average stroke-width constant, there remains only
one way of adjusting their recognizability to standard, namely to increase
or decrease their “inner distances,” that is the horizontal distances between
their vertical boundaries. This has been done with each single number 
(with exception of number 1, which has no inner distance), by constructing 
a series of the particular number to be investigated, 80 mm high, with a 
stroke-width of 6 mm, but with different inner distances, e.g. 20, 24, 28 
and 32 mm, and determining their threshold recognizability in successive 
order. The standard number eight was used to fix standard recognition, 
before and after each series of experiments. In this way two curves were 
found and their point of intersection was used to determine the exact 
“inner distance” for each number, corresponding to the recognizability of 
the standard number 8. The results with numbers 0, 2, 3, 4, 5, 6 (9) and 7
are shown in Figure 7. It is interesting to note that with the exception of
0, 2 and 7 the curves are not linear. From the curves with the numbers
3 and 4 it can be seen that a further increase of their inner distances would 
not have improved their recognizability. They are those numerals, for 
which only a simultaneous increase of stroke-width would lead to further 
improvements. The optimal legibility of these numbers with 6 mm 
stroke-width and 80 mm height, is very close to standard, and can just be 
considered satisfactory. 
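The graphical determination of the point of intersection can be imitated numerically. The sketch below is one possible reading of the procedure, not the authors' own computation: it treats the standard as a flat level (the text describes the connecting line as practically horizontal) and interpolates linearly between neighbouring inner distances. The function name and all threshold values are hypothetical.

```python
def inner_distance_for_standard(inner_mm, thresholds_m, standard_m):
    """Interpolate the inner distance at which a numeral's threshold
    curve crosses the (assumed flat) standard level."""
    points = list(zip(inner_mm, thresholds_m))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        crosses = (y0 - standard_m) * (y1 - standard_m) <= 0
        if crosses and y0 != y1:
            return x0 + (standard_m - y0) * (x1 - x0) / (y1 - y0)
    return None  # the curve never reaches the standard in the tested range


# Hypothetical thresholds (m) for inner distances of 20, 24, 28 and 32 mm.
example = inner_distance_for_standard(
    [20, 24, 28, 32], [40.0, 42.0, 44.0, 46.0], standard_m=43.0
)
unreachable = inner_distance_for_standard(
    [20, 24, 28, 32], [40.0, 42.0, 44.0, 46.0], standard_m=50.0
)
```

With these made-up values the curve crosses the standard at an inner distance of 26 mm. A return value of `None` corresponds to the situation described for the numbers 3 and 4, where a further increase of the inner distance no longer helps.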

(6) Adjustment of the distance between two numerals to standard. If 
two numerals are used on the same plate, their recognizability is influ- 
enced by the distance between them. This problem is complicated by 
two facts. First, the distance between two numerals necessary for equal- 
ity with standard depends upon the particular form or structure of the 
left as well as right side boundaries of each numeral. If these are not
identical, both have to be investigated separately. Second, the distance 
between the numerals, necessary for standard resolution will depend upon 
the number of numerals used. The more numerals used on the same 
plate, the more distance between each number will be required. The 
present investigation has been limited to a thorough study of all pos- 
sible two-number groups, and to a rougher checking of five-number 
groups. 

Since numeral 1 is an exception and number 8 is used for standardization,
the first series have been carried out with these two numbers separately.
Both are perfectly harmonic and need no special experiments for
each side. The same is true of number 0, which therefore was used in a
third series and subsequently chosen to be used for the rest of the
numerals. Figure 8 shows these three experimental series, plotted like the
previous figures from the average recognition thresholds of 4 Ss. In these
experiments the subjects were asked only to observe and indicate at which
moment (measured in distances from the S’s eye) the two numerals
fused into one, or separated again into two patterns, disregarding whether
they also could recognize the numerals themselves. The points before
and after each series are results with standard number eight. Its threshold
of recognizability was determined 8 times (4 times the appearance and
4 times the disappearance of a recognizable number 8). The averages of
these observations are connected directly by a straight line, which in most
cases is practically horizontal. The point of intersection of the two
curves shows the distance between two numerals 1, 8 and 0, at which
their resolution was equal to standard. For the single number, 1/2 of that
distance will be required, no matter which other numeral is combined
with it, assuming that only two numerals are used together and a further
half of the outer distance for the other numeral is added.

Fig. 7. Influence of the horizontal distance between the inner borders of the
numerals 0, 2, 3, 4, 5, 6 and 7 upon their recognizability, with constant height (80 mm)
and stroke-width (6 mm). The horizontal curves connect points which represent
series of experiments, made before and after each series, with “standard” number 8
(compare text). The point of intersection of the two curves, represented in each figure,
indicates the “inner” distance of each numeral corresponding to “standard” visibility.
[Horizontal axis: inner distances.]

Fig. 8. Determination of the influence of distances between two numerals upon
their resolution thresholds (m). Average results with 4 subjects are shown for the
numerals 0, 1 and 8. The points before and after the series represent threshold
measurements with standard number 8. [Horizontal axis: distances between two numbers.]

In the following experiments with the numerals 2, 3, 4, 5, 6 and 7 the
number 0 was used to left and right of each numeral, changing the
distance between the 0 and the specific number, and using the standard
number eight before and after each series. The results are represented in
Figure 9. In each case two curves, or in some cases three, could be drawn;
the point of their intersection indicates the distance required between
two numerals for standard recognizability. Subtracting from each such
value the above figured half distance between two numbers 0 (Figure 8)
gives the left and right distance for each numeral, in order to bring about
standard recognizability distances for any two-numeral combination.
The final results for all numerals are indicated in Table 2.
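The subtraction just described may be sketched as follows. The function name and the millimetre values are hypothetical and serve only to show the bookkeeping; the actual distances are those of Table 2.

```python
def side_distances(zero_pair_mm, zero_in_front_mm, zero_behind_mm):
    """Split measured pair distances into the left and right outer
    distances of a single unsymmetrical numeral.

    zero_pair_mm     -- distance between two numerals 0 at standard resolution
    zero_in_front_mm -- distance found with a 0 placed in front of the numeral
    zero_behind_mm   -- distance found with a 0 placed behind the numeral
    """
    half_zero = zero_pair_mm / 2  # share of the distance that belongs to the 0
    left = zero_in_front_mm - half_zero
    right = zero_behind_mm - half_zero
    return left, right


# Hypothetical measurements in mm.
left_mm, right_mm = side_distances(
    zero_pair_mm=20.0, zero_in_front_mm=18.0, zero_behind_mm=21.0
)
```

In this made-up case the numeral would carry 8 mm on its left and 11 mm on its right, so that any two numerals placed side by side contribute their own halves of the required distance.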





Fig. 9. Determination of the influence of distances between 2 numerals upon their
resolution thresholds (m). Average results with 4 subjects are shown for the numerals
2, 3, 4, 5, 6 and 7. The numeral 0 was used as a constant before and after the numerals
because of their unequal structure. The points before and after the series represent
threshold measurements with standard number 8. [Panels: 0 in front of and 0 behind
each numeral.]


When experiments were made to check the legibility of a five-number
group, using the outer distances corresponding to the recognizability of
standard, it was found that a five-number group is less legible than
the standard number eight alone. The total five-number group becomes as
visible as standard only if the outer distances between the numbers are
increased about 10%. These figures are included in Table 2.

Table 2

Final dimensions for all daylight-numbers, stroke-width 6 mm, height 80 mm.

39.3 66.3 64
11.1 : 12.2 12.2 6.0 30.4 42
8.5 : 94 | 10.9
10.0 9.4 59.9 64
8.5 8.9 41.2 58.6 64
6.1 4.0
43.0 50.4 64
8 . . 43.0 62.0 64
Point 6.0 6.0 16

(7) Experiments with white frame. It is a well-known fact that the
surroundings influence the legibility of symbols or patterns on a definite
area. Therefore, experiments were made in order to test such an
influence. Straight lines of different widths on the top and the bottom were
used with a five-number group. It was found that the use of such
lines affects the recognizability in the following way: If lines are used
which are wider than the lines of the numbers, the numbers appear grayish
and of less recognizability and the frame appears lighter. If thinner lines
are used as frame, the recognizability of the numbers is not noticeably
increased, and the frame appears grayish and the number lighter. When
a frame of lines was used of the same width as the lines of the numbers,
namely 6 mm and about 30 mm distant from the top and the bottom of
the five-number group, the recognizability of the five-number group
was definitely increased, and both numerals as well as frame appeared
equally white. The results are shown in Table 4 as compared with the
five-number group without a frame.

Fig. 10. Final appearance of all numerals, white on a black background for day-vision,
reduced to 1/63rd of the original area. (Compare Table 2.)

Table 3

Comparison of the recognizability of the Danish numerals (Dan.) and the numerals found
in this investigation (New), white on a black background under daylight. Height:
80 mm, stroke-width of new numbers 6 mm, of Danish numbers 16 mm.
(Compare Table 2.) 4 observers (O.), each number representing 8 single observations.

        |   O.: Gi.   |   O.: Ni.   |  O.: Sch.   |   O.: Ra.   |   Average   | Improvement
Numeral | Dan. | New  | Dan. | New  | Dan. | New  | Dan. | New  | Dan. | New  |  m   |  %
   0    | 43.2 | 49.4 | 37.3 | 45.7 | 42.4 | 51.7 | 42.4 | 50.4 | 41.3 | 49.3 |  8.0 | 19.4
   2    | 36.6 | 44.1 | 40.3 | 42.9 | 35.3 | 45.9 | 35.3 | 46.5 | 36.9 | 44.8 |  7.9 | 21.4
   3    | 29.1 | 40.4 | 30.6 | 44.6 | 28.1 | 37.2 | 27.4 | 41.2 | 28.8 | 40.8 | 12.0 | 41.7
   4    | 29.2 | 39.9 | 31.2 | 45.0 | 27.1 | 37.9 | 27.3 | 42.1 | 28.7 | 41.2 | 12.5 | 43.5
   5    | 32.6 | 42.9 | 33.9 | 44.7 | 32.6 | 38.6 | 31.3 | 43.7 | 32.6 | 42.5 |  9.9 | 30.4
   6    | 32.2 | 43.5 | 31.3 | 48.4 | 34.1 | 41.4 | 27.3 | 46.1 | 31.2 | 44.8 | 13.6 | 43.6
   7    | 37.3 | 43.7 | 37.3 | 44.7 | 35.3 | 40.8 | 32.6 | 45.9 | 35.6 | 43.8 |  8.2 | 23.0
   8    | 30.5 | 42.8 | 33.4 | 46.1 | 30.3 | 43.9 | 34.2 | 49.9 | 32.1 | 45.7 | 13.6 | 42.3
   9    | 33.9 | 43.2 | 35.3 | 46.1 | 35.2 | 44.9 | 37.7 | 51.3 | 35.5 | 46.4 | 10.9 | 39.2

(8) Final appearance of equally and optimally recognizable white day- 
light-numerals. Figure 10 shows the appearance of all 10 numerals for 
day vision, reduced to 1/63rd of the area used during the experiments, in 
two five number groups with white frame. Comparative results, between 
5 number groups and single numerals as well, are contained in Table 3 
and Table 4. 

Summarizing these results, it can be said: The new numbers, although
most of them cover an area 10% to 20% less wide than the Danish
numerals, are, singly, on an average about 33.8% more legible than the old Danish
numbers.

Table 4

Comparison between the recognizability of 2 five-number groups, using the Danish
numerals as well as the newly constructed numbers, white on black background.
Four observers. Each number of the table is the average of 8 single observations.

       |      |     New number      |     Improvement     | Improvement in per cent
Number | Dan. | Without  |  With    | Without  |  With    | Without  |  With
       |      |  frame   |  frame   |  frame   |  frame   |  frame   |  frame
       |  m   |    m     |    m     |    m     |    m     |    %     |    %
       | 33.8 |   42.3   |   46.6   |    8.5   |   12.8   |   25.2   |   37.9
       | 35.2 |   47.2   |   48.1   |   12.0   |   12.9   |   34.1   |   36.7
80.236 | 38.8 |   46.4   |   48.6   |    7.6   |    9.8   |   19.6   |   25.2
91.475 | 39.0 |   48.1   |   48.9   |    9.1   |    9.9   |   23.3   |   25.4
80.236 | 38.0 |   49.7   |   53.8   |   11.7   |   15.8   |   30.8   |   41.6
91.475 | 38.3 |   49.9   |   53.8   |   11.6   |   15.5   |   30.3   |   40.5
80.236 | 32.4 |   40.6   |          |    8.2   |   13.5   |   25.6   |   42.2
91.475 | 33.7 |   42.1   |          |    8.4   |    —     |   24.7   |    —

A five-number group without white frame is, on the average, 26.7% 
more legible, while with a white frame as described above the same groups 
are 35.6% more legible than corresponding Danish 5-number groups. 
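The three summary percentages can be reproduced from the per cent columns of Tables 3 and 4. The sketch below assumes that each summary figure is the plain mean of the tabulated percentages, which is an inference and not something the text states; the values are copied as printed and the variable names are illustrative.

```python
from statistics import mean

# Per cent improvement of single numerals (Table 3, last column).
single_pct = [19.4, 21.4, 41.7, 43.5, 30.4, 43.6, 23.0, 42.3, 39.2]

# Per cent improvement of five-number groups (Table 4); the one missing
# "with frame" entry is simply left out of the second list.
group_no_frame_pct = [25.2, 34.1, 19.6, 23.3, 30.8, 30.3, 25.6, 24.7]
group_frame_pct = [37.9, 36.7, 25.2, 25.4, 41.6, 40.5, 42.2]

single_avg = round(mean(single_pct), 1)
no_frame_avg = round(mean(group_no_frame_pct), 1)
frame_avg = round(mean(group_frame_pct), 1)
print(single_avg, no_frame_avg, frame_avg)
```

Under that assumption the means come to 33.8, 26.7 and 35.6 per cent, matching the figures given in the summary.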

In view of the fact that a group of five numbers requires more atten- 
tion than one single number, the use of a mark before the last three 
numbers is advisable. Experiments have shown that for such a purpose 
a dot of 6 mm sq., white on a black background, for white numerals is 
sufficient. No distance between the point and the above described new 
numbers (including their outer distances) is required and at long distances 
the point becomes invisible, while its effect remains only as a slight 
increase of the distance between the adjacent numerals. 


To be continued in the August issue. 
Received April 26, 1943. 











A Comparative Study of Forgery * 
Irwin August Berg 


University of Illinois 


Members of society at large frequently reveal stereotyped thinking 
about classes of crime and criminals, a tendency which often limits voca- 
tional opportunities for discharged or paroled inmates. Yet inmates 
themselves rarely reveal such stereotypes concerning their fellow prisoners 
or their offenses. Prison officials sometimes hear inmates make an
occasional sweeping generalization such as a car theft is “kid stuff,” which is an
expression more of contempt than of categorized thinking. But when
speaking of forgers and forgery, many prisoners express opinions which
are quite uniform. These opinions are usually expressed in institutional
slang; however, the idea conveyed is that forgers are intelligent and
invariably recidivistic. Common phrases used by inmates are “once a
paperhanger (forger), always a paperhanger”; “he’s a paperhanger, he’s a
genius.” Specific forgers may be occasionally referred to by inmates as
“Dr. I.Q.,” “The Brain,” etc.; and while such phrases are used jestingly,
many inmates appear to believe that forgers are superior intellectually.

Studies such as those of Frank (3) or Murchison (8) suggest that 
forgers tend to score above average in intelligence. Such studies suggest 
that the inmate opinions concerning forgery correspond, at least partly, 
with fact. Accordingly, it was planned to tabulate all cases of forgery 
at the State Prison of Southern Michigan from Jan. 1, 1940 to August 1, 
1942 and then to analyze the data secured from this tabulation. In all, 
135 cases of forgery were secured. The information was then compared 
with a control group of 480 inmates sentenced for all types of crimes. 
The control group was secured by drawing, at random, inmate data cards 
for the same two and one-half year period. The distribution of intelligence
for this control group compares reasonably well with the findings of
other studies. Zeleny (10 p. 576) estimates that 3.2 per cent of the male 
criminals should fall in the “feebleminded” category of intelligence. 
Under his conditions, 3.6 per cent of the 480 control group cases would be 
classed as “feebleminded.” Murchison (8 p. 43), when comparing Army
Alpha letter grades for criminals, lists 53.8 per cent A half scores and 46.5
per cent E half scores. The control group in the present study lists 53.1
per cent A half scores and 46.9 per cent E half scores. Further details
of the control group are given in another study by the author (1).

* The author wishes to thank Dr. Garrett Heyns, Director of Corrections in
Michigan, and Warden Harry Jackson of the State Prison of Southern Michigan
for permission to publish these data and for their friendly encouragement
during the course of the study.

The mean I.Q., mean grade placement, and mean chronological age 
when admitted to prison were calculated for both groups. Information 
from the prison records as well as interview data from the prison socio- 
logical reports were also tabulated. The Bregman Revision of the Army 
Alpha Examination (2) was used as a measure of intelligence, and the 
Stanford Achievement Test (6) was used to determine grade placement. 
Tests were administered in an isolated room between the sixth and four- 
teenth day after admittance to prison. Motivation in the testing situa- 
tion was probably high since the inmates knew that their test scores would 
be considered by the prison classification committee when assigning prison 
jobs. 

Because Army Alpha raw scores were not readily available, the converted
I.Q. scores were used. It should be remembered that the average Army Alpha
raw score of the general population converts to an I.Q. of only 89 if a
chronological age of 15 is employed. Thus an Alpha I.Q. of 89 may be
considered to represent the performance of the average adult population.
The average Alpha I.Q. of the forgery group was found to be 99.4. Being
about ten points above that adult average, this would fall just within the
superior group of the Terman-Binet I.Q. distribution, where an I.Q. of 110
is considered to be the lower end of the superior group range. See Freeman
(4, p. 127 ff) for further details.


Discussion and Conclusions 


From the data presented in Table 1 it seems that the opinions of other 
inmates concerning forgers are partially substantiated. The group of 
forgers studied averaged higher in intelligence and in grade placement 
than the control group. Also, the forgery group was eight years older, 
on the average, and more than twice as recidivistic when compared with 
the all-prison sample. These differences were found to be statistically 
significant. 
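
The critical ratio used throughout this comparison is a difference divided by its standard error. Table 1 reports no critical ratio for the percentage of each group with previous prison sentences, so the following is only an illustrative sketch of how such a ratio could be obtained from the published percentages (29.6 and 62.2) and group sizes (480 and 135). The function name and the unpooled standard-error formula are assumptions made for the illustration, not the author's stated procedure.

```python
import math


def critical_ratio_proportions(p1, n1, p2, n2):
    """Difference between two independent proportions over its standard error.

    Uses the unpooled standard error sqrt(p1*q1/n1 + p2*q2/n2); this choice
    is an assumption, since the paper does not state its formula.
    """
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p2 - p1) / se


# Percentages with previous prison sentences (Table 1) and group sizes (text).
control_p, control_n = 0.296, 480
forgery_p, forgery_n = 0.622, 135

cr = critical_ratio_proportions(control_p, control_n, forgery_p, forgery_n)
print(f"critical ratio = {cr:.2f}")
```

Under these assumptions the ratio comes out at roughly 7, the same order of magnitude as the critical ratios reported for age, grade placement and Alpha I.Q.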








Table 1
Comparison of Forgery and Control Group Data

                                          Control   Forgery   Critical
                                          Group     Group     Ratio

Number of cases                           480       135
Mean age                                  29.2      37.2      6.9
Mean Grade Placement                      5.2       6.9       6.6
Mean Alpha I.Q.                           89.1      99.4      6.6
Per cent of group with previous
  prison sentences                        29.6      62.2
Per cent Negro offenders in group         24.4      6.7

Since the forgery group is older, it might be thought that it would
necessarily follow that more of its members would be recidivistic. That 
is, having lived longer, the forgers would have had more years in which to 
commit crimes. If this were true, the forgers who had no previous prison 
sentences should be expected to be significantly younger than those who 
had been in prisons previously. In Table 2 the average age of non- 
recidivistic forgers was found to be 35.5 while the recidivistic forgers 
averaged 38.2 years. The critical ratio of these averages is 1.2, indicating 
that, while age is of some importance in recidivism among forgers, it is 
not nearly as important as it is among the recidivistic and non-recidivistic 
members of the control group. The critical ratio is 2.8 for the control 
group recidivists and non-recidivists. That is, the control group members 
who had been in prison previously were significantly older than the control
group members who were in prison for the first time. Thus age relates to
recidivism in the control group. But in the case of the forgery group, age
is a less important factor in recidivism. Other factors, which will be
considered later, are of equal or greater importance.


Table 2
Average Age with Reference to Recidivism

                                          Control   Forgery   Critical
                                          Group     Group     Ratio

Without record of previous
prison sentences
  Number                                  339       51
  Mean Age                                28.2      35.5      4.0
With record of previous
prison sentences
  Number                                  141       84
  Mean Age                                31.4      38.2      4.2
Critical ratio                            2.8       1.2

There does appear to be a difference between the recidivistic and non- 
recidivistic forgers which relates to the modus operandi and the circum- 
stances in which the offense was committed. This difference can be 
determined from the prison interview and court records. The offenses 
of the 51 forgers who had no previous felony convictions tended to be 
rather direct and uncomplicated. These offenders tended simply to forge 
a name on an otherwise valid check or, less frequently, to forge one or two 
checks completely.¹ Except for three or four cases, there was no evidence
that any of the 51 non-recidivistic forgers prepared for their offenses by
establishing confidence in a proposed victim by building up a “front,”
i.e., lavish tips to servants, entertaining victims with dinners, and in
general seeking to present the appearance of a solid, successful business
or professional man.

¹ Forgery of documents other than checks was committed by members of this group;
however, such cases were so few that it is believed unnecessary to consider them
separately.

On the other hand, over two-thirds of the 84 members of the forgery 
group who had been imprisoned before revealed more or less careful 
preparation for the offense. Also, many checks were usually forged; over 
a hundred in several instances. When the police photographs and de- 
scriptions were circulated, it was not unusual to have complaints and 
inquiries appear by the dozen from all over the country. In preparing 
for the offense, “setting up the sucker” in prison parlance, the prepara- 
tions were sometimes very simple such as eating a meal or two in a 
restaurant and chatting with the proprietor and then offering a worthless 
check. For an offense which involved a large sum, the preparations were 
sometimes quite elaborate, including investigation of the victim’s finances, 
habits, attitudes, etc. The forgers with previous prison records often 
adopted a role calculated to impress the intended victim, pretending to be 
a minister, lawyer, executive or anything which the victim was likely to 
accept and which the forgers could carry off successfully. In the group 
studied, such careful preparations were found in only three cases among 
the non-recidivistic forgers, and only one of these three revealed elaborate 
preparations; whereas more than thirty of the 84 recidivistic forgers 
showed evidence of employing prolonged and detailed plans for offenses 
in one or more instances. 

From a social standpoint it may be difficult for Negroes to build up a 
successful “front” as easily as whites. If this is true, it would relate to the 
fact that the percentage of Negroes in the forgery group was only about 
one-fourth the Negro percentage of the control group. Of the nine Ne- 
groes in the forgery group, eight of them had been imprisoned before. 
The previous convictions of these eight had been for crimes of larceny or 
breaking and entering, none for forgery. 

In Table 3 the lower percentage of arrests for drunkenness in the for- 
gery group may merely reflect a behavior difference. The forgery group 
members may have drunk as much but behaved less riotously than control 
group members; hence they were arrested less frequently on such charges. 
The recorded interviews of the prison sociologists sometimes contained 
phrases such as “self-assured,” “poised,” etc., when describing various inmates
convicted of forgery. It may be that by virtue of appearance and com- 
portment forgers were sometimes returned to their hotels or homes in- 
stead of lodged in jail for drunkenness. 

The item “Record of Mental Observation in Prison Hospital” reveals
no valid difference between the forgery and control groups. The conditions
for admittance to the prison psychiatric ward were fairly standard.
Because the psychiatrist is extremely busy, an inmate’s behavior must 
usually be sufficiently aberrant to suggest that he may harm himself or 
others before a prison official will request mental observation. The psy- 
chiatrist himself detects other cases during routine interviews. Such 
observation does not necessarily mean that a psychosis exists. An in- 
mate, for example, might perhaps be given a psychiatric examination 
after a violent fight with another inmate and then discharged after the 
temporary disturbance had passed. But such observation does reflect 
prison adjustment; hence it may be said that the forgery group members 
adjust about as well as the control group members. 


Table 3
Comparison of Criminal Record Data

                                          Control Group        Forgery Group
                                          50 Cases  Per Cent   25 Cases  Per Cent

Previous jail sentences, fines,
  probation for drunkenness               14        28
Record of mental observation
  in prison hospital                      7         14
Police record of fines, sentences,
  or probation for acts of
  criminal impulse                        19        38
Record of previous probation
  or sentence for forgery





The data presented in Table 3 were obtained by subjecting a random 
sample of 50 criminal record folders from the 480 members of the control 
group to intensive study. The same was done with a random sample of 
25 folders drawn from the forgery group. 

In tabulating “acts of criminal impulse” in Table 3, a subjective
appraisal of previous offenses was made. Offenses such as mayhem, 
assault and battery, and certain cases of rape or manslaughter were 
evaluated as to the impulsiveness of the action. While this item was 
not objectively determined, the results appear to fit into the general 
pattern. That is, the majority of the forgers appeared to act with fore- 
thought, often planning their offenses with considerable care, almost 
never acting on impulse. Because probationary periods as well as actual 
prison sentences were included under “previous forgery” in Table 3, the 
percentage of forgers with such records is higher than the actual percentage
of recidivism (62.2%) which included only previous imprisonments.
Also, several of the non-recidivistic forgers had been on probation for
forgery.

The police records of some of the 84 recidivistic forgers reveal records 
of forgery, fraud, and investigations for confidence games with regularity. 
Psychopathic tendencies are evident in some cases, as illustrated by the 
case of forger X. This man, when interviewed, described how he had
“passed” a forged check for 125 dollars. In order to establish a “front”
he had spent over ninety dollars for fake telegrams, entertaining his 
victim, and so on. He spent two weeks preparing for the actual check 
passing. It was pointed out to forger X that he probably could have 
cleared much more than 30 dollars honestly if he had devoted a similar 
amount of energy to selling something, even door-to-door vacuum cleaner 
selling. Forger X admitted that this was perhaps true. The substance 
of his ensuing explanation was that he enjoyed “matching wits and then 
beating a guy out.” 

While this case is an extreme example, the records of many other 
forgers follow a similar, if less pronounced, pattern. Evidence of great 
pleasure when “beating a guy out of something” is encountered frequently
among members of this group. Some forgers, like forger X, would probably
rather “work” a victim for a small amount of money than earn a
larger sum honestly for an equal effort. What Sadler (9 p. 882) refers to
as a “pathologic swindler” is a case in point. But while this tendency
may be called psychopathic, it must be recognized that it does not inter- 
fere with prison adjustment, apparently, since the “Record of Mental 
Observation” in Table 3 is about the same for the forgery and control 
groups. 

From the standpoint of mental and social equipment, the group of 
forgers studied may be considered to be better qualified to support them- 
selves in a manner acceptable to society than a random group of inmates 
of similar size. In fact, many members of this group were superior to 
many other members of society at large. But hope for rehabilitation of the 
forgery group members appears to be less than for other inmates in view of 
the unusually high rate of recidivism among forgers. It may be that if 
general rehabilitation efforts were concentrated on those forgers who man- 
ifested no psychopathic tendencies in the commission of their offenses and 
if psychiatric treatment were emphasized during the imprisonment of 
other forgers who revealed such tendencies, the rate of recidivism might 
decrease. 

It is suggested that further research concerning the psychopathic 
aspect of forgery is likely to be fruitful. A convenient instrument for such 
research is the Minnesota Multiphasic Personality Schedule (5) by 
McKinley and Hathaway.


Summary


All cases of forgery from January 1, 1940 to August 1, 1942, totalling
135 cases, were compared with a random sample of 480 cases of inmates 
sentenced to the State Prison of Southern Michigan for various offenses. 
The forgery group was found to be older, to score higher on intelligence 
and grade placement tests, and to be more than twice as recidivistic as 
the control group. These findings partially substantiated the opinions of 
other inmates concerning forgers. 

The forgers who had no previous prison sentences forged few docu- 
ments and rarely made detailed preparations for their offenses. The 
forgers who had been in prison before (usually for forgery or fraud) tended 
to lay careful, even elaborate plans with the aim of securing the confidence 
of their victims. Also, the members of this latter group tended to forge 
many checks. A number of the members of this latter group revealed 
evidence, in varying degree, of psychopathic tendencies in the commission 
of their offenses. It is suggested that the high rate of recidivism among 
forgers is related to psychopathic tendencies, and that further research 
in this area with the Minnesota Multiphasic Personality Schedule (5) 
would be highly desirable. 


Received July 3, 1943.



The Prediction of College Achievement and Satisfaction *


Ralph F. Berdie 
University of Minnesota 


Studies of prediction at the college level have been primarily concerned 
with the prediction of academic achievement. Measures of achievement 
have usually been college grades or achievement test scores. Prediction 
measures have been based on secondary school grades, aptitude tests, 
information tests, interest tests, personality adjustment scales and infor- 
mation about the student’s family and social background. As would be 
expected, previous school achievement most accurately predicts college 
grades; interest and personality test scores least accurately predict college 
grades. College marks are definitely related to high school grades and 
scholastic aptitude. They appear to bear little, if any, relation to meas- 
ured interests. As long as psychologists are concerned with the prediction 
of college grades alone, correlations are raised negligibly by the inclusion 
of interest test scores. 

College students who seek counseling, however, desire not only to 
determine in what field they will do well but also what field they will like. 
Of 327 students seeking counseling at the University of Minnesota Testing 
Bureau, 230 wished to determine what they would be best able to do; 
164 of them wished to determine what they would most like to do (1). 

Although interest test scores do not predict how well students achieve 
in college, they have been used in counseling on the assumption that they 
help students select fields of training and work that will be liked. Coun- 
selors have helped students choose occupations that will most probably 
be compatible with both abilities and interests. 

The purpose of this study was to determine if the satisfaction a stu- 
dent derives from his college course could be predicted by his responses on 
the Strong Vocational Interest Blank (5) or by other predictive indices. 


Procedures 


A measure of curriculum satisfaction was obtained by adapting the 
front page of Hoppock’s Job Satisfaction Blank. The revised form is 
shown in Figure 1. The blank was scored by simply adding the numbers 
of the checked statements. No attempt was made to scale the statements.

* This study is one of a series of studies in process on problems of interest
measurement at the University of Minnesota Testing Bureau (2) (3) (4).


Curriculum Satisfaction Blank

Your success in school and college depends to a great extent upon how well
you are satisfied with the course of study you are taking. This blank will help
you to determine systematically how your curriculum satisfaction compares to
the satisfaction of other students. Answer the questions in regard to the
curriculum in which you are now enrolled or if you are not in school, the
curriculum in which you were last enrolled or the one which you are
contemplating. The results will be kept confidential.

Name ......

Name of Present, Last or Contemplated Curricula (pre-law, medicine,
mechanical engineering, etc.) ......

A. Choose the one of the following statements which best tells how
   well you like your curriculum. Place a check mark (✓) in front of
   that statement:

   ...... I don’t like it.
   ...... I am indifferent to it.
   ...... I am enthusiastic about it.
   ...... I like it better than I could possibly like anything else.

B. Check one of the following to show how you think you compare with
   other people:

   ...... No one likes his course better than I like mine.
   ...... I like my course much better than most people like theirs.
   ...... I like my course better than most people like theirs.
   ...... I like my course about as well as most people like theirs.
   ...... I dislike my course more than most people dislike theirs.
   ...... I dislike my course much more than most people dislike theirs.
   ...... No one dislikes his course more than I dislike mine.

C. Check one of the following to show how much of the time you feel
   satisfied with your curriculum:

   ...... All of the time.
   ...... Most of the time.
   ...... A good deal of the time.
   ...... About half of the time.
   ...... Occasionally.
   ...... Seldom.

D. Check the one of the following which best tells how you feel about
   your curriculum:

   ...... I would change my course at once if I had anything else to
          which I could change.
   ...... I could change to almost any other course which was practical.
   ...... I would like to change my course.
   ...... I would like to change my course for another somewhat similar
          to it.
   ...... I am not eager to change my course but I would do so if it
          were more practical.
   ...... I can not think of any course for which I would exchange mine.
   ...... I would not exchange my course for any other.

Fig. 1. Curriculum satisfaction blank used to determine satisfaction derived
from college course.


The odd-even reliability was determined for 144 blanks filled out
by engineering students. The correlation between scores on the two
halves was .77, corrected by the Spearman-Brown formula to .87. In
terms of reliability, the blank provides a satisfactory criterion of curricu- 
lum satisfaction. 
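
The step from .77 to .87 follows from the Spearman-Brown formula for a test of doubled length, r' = 2r / (1 + r). The short sketch below checks that arithmetic; the function name and the general `factor` parameter are illustrative additions and do not come from the study.

```python
def spearman_brown(r_half, factor=2.0):
    """Predicted reliability when a test is lengthened by `factor`.

    With factor=2 this is the usual split-half correction 2r / (1 + r).
    """
    return factor * r_half / (1 + (factor - 1) * r_half)


# Odd-even correlation reported for the 144 engineering blanks.
corrected = spearman_brown(0.77)
print(round(corrected, 2))  # 0.87
```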

The intercorrelations between the four items are presented in Table 1. 


Table 1
Intercorrelations Between the Four Items of the Curriculum Satisfaction Blank

                  Items
Items       B       C       D

A          .58     .53     .37
B                  .83     .41
C                          .63







The correlations between each pair of items are substantial. The means 
and standard deviations for the statements checked by the 144 students 
are: 


Item A    Mean =  5.26    S.D. =  .77
Item B    Mean =  4.66    S.D. =  .83
Item C    Mean =  5.67    S.D. = 1.05
Item D    Mean =  5.58    S.D. = 1.06
Total     Mean = 21.06    S.D. = 2.98


Inspection of the blanks reveals that total scores of 18 or below indi- 
cate dissatisfaction with the curriculum, scores of 25 or above indicate 
satisfaction. These points are approximately one standard deviation 
below and above the mean. 
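
The scoring rule and the two cut-off points can be sketched as follows. The total mean and standard deviation are those reported above; the label "intermediate" for scores between the cut-offs, the function names and the example responses are assumptions introduced only for the illustration.

```python
MEAN_TOTAL, SD_TOTAL = 21.06, 2.98   # total-score mean and S.D. for the 144 blanks
LOW_CUTOFF, HIGH_CUTOFF = 18, 25     # cut-off scores given in the text


def score_blank(a, b, c, d):
    """Total score: the sum of the numbers of the four checked statements."""
    return a + b + c + d


def satisfaction_band(total):
    """Classify a total score using the two cut-offs from the text."""
    if total <= LOW_CUTOFF:
        return "dissatisfied"
    if total >= HIGH_CUTOFF:
        return "satisfied"
    return "intermediate"  # label assumed; the text names only the two extremes


def z_score(total):
    """Distance of a total score from the mean, in standard deviation units."""
    return (total - MEAN_TOTAL) / SD_TOTAL


for cutoff in (LOW_CUTOFF, HIGH_CUTOFF):
    print(cutoff, satisfaction_band(cutoff), round(z_score(cutoff), 2))

# Hypothetical responses to items A through D.
print(satisfaction_band(score_blank(5, 5, 6, 6)))
```

On these figures the lower cut-off lies about one standard deviation below the mean and the upper cut-off somewhat more than one above it, which is what "approximately" covers.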

During the last week of the fall quarter in 1942, curriculum satisfaction 
blanks were mailed to 310 students who entered the Institute of Tech- 
nology of the University of Minnesota in the fall of 1941 and who were 
still in residence at the University. Of these 310 students, 50% filled out 
the blank and returned it in a usable form. 

In the fall of 1941, 497 freshmen entering the Institute of Technology 
were given a battery of tests. Scores based on the following were avail- 
able for these students: High school grades; American Council on Educa- 
tion Examination for 1937; Cooperative English Test, form OM; Coopera- 
tive Mathematics test, form HS-P; Cooperative Chemistry test, form Q; 
Revised Minn. Paper Form Board, form MA; Minn. Vocational test for 
Clerical Workers. 

The honor point ratio, based on all grades earned during the first year, 
was available for each student. The score on the engineers key of the 
Strong Vocational Interest Blank was available for each of the 154 students
who returned the curriculum satisfaction blank. Scores on 34
occupational keys, masculinity-femininity and occupational level were also
available for 43 of these students.

Correlations were computed for the 154 cases between the predictive 
indices, honor point ratio, and curriculum satisfaction. The students for 
whom completely scored Strong tests were available were divided into 
groups having primary, secondary, tertiary, and no interest pattern in 
the engineering field (3), and analysis of variance and covariance was utilized
to determine the curriculum satisfaction expressed by each group. These 
43 students were also divided into two groups on the basis of the amount 
of curriculum satisfaction expressed and compared on the basis of honor 
point ratio, high school achievement, and interest test scores. Compari- 
sons of groups with extremely high and low curriculum satisfaction scores 
were also made on the basis of these variables. 


Results 


The correlations between the predictive indices and honor point ratio 
and curriculum satisfaction score are presented in Table 2. Each of the
measures of ability and scholastic achievement has a significant correlation
with honor point ratio. Only high school percentile rank bears a
significant relationship to curriculum satisfaction and here the relationship 
is much smaller than that between high school percentile rank and college 
honor point ratio. Although a student’s collegiate achievement can be 
predicted upon the basis of measures of ability and information, these 
measures are not effective predictors of the curricular satisfaction ex- 
pressed by the student. 


Table 2
Correlations for 154 Engineering Students Between Predictive Indices, Honor Point
Ratio and Curriculum Satisfaction Score

                                              Correlation with   Correlation with
                                              Honor Point        Curriculum
Predictive Index                              Ratio              Satisfaction

High school percentile rank                   +.56*              +.23*
American Council Examination                  +.21*              +.02
Cooperative English Test                      +.28*              +.04
Cooperative Mathematics Test                  +.45*              +.13
Cooperative Chemistry Test                    +.34*              +.05
Minnesota Paper Form Board                    +.22*              +.06
Minn. Voc. Test for Clerical Workers: Nos.    +.35*              +.08
Minn. Voc. Test for Clerical Workers: Names   +.40*              -.02
Engineers Key of Strong Test                  +.13               +.10

* Significantly different from 0.00 at 1 per cent level of probability.

The extent to which the vocational interests of an engineering student
resemble the interests of successful engineers, as shown by his score on the
engineers key of the Strong Interest Blank, is not significantly related to
his honor point ratio nor his curriculum satisfaction. The low correlations
indicate that neither academic achievement nor the amount of
satisfaction expressed by a student can be predicted by this one score on
the Strong test.
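
The asterisks in Table 2 can be checked by converting each correlation to a t value with n - 2 degrees of freedom, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below does this for a few of the tabled coefficients. The critical value of 2.61 is an approximate two-tailed 1 per cent point for about 150 degrees of freedom and is an assumption here, since the paper does not state which test or table was used.

```python
import math

N_STUDENTS = 154
T_CRIT_01 = 2.61  # approximate two-tailed 1% critical t for df near 150 (assumed)


def t_from_r(r, n):
    """t statistic for testing a correlation against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Selected correlations with honor point ratio, taken from Table 2.
honor_point_r = {
    "High school percentile rank": 0.56,
    "American Council Examination": 0.21,
    "Engineers Key of Strong Test": 0.13,
}

for name, r in honor_point_r.items():
    t = t_from_r(r, N_STUDENTS)
    flag = "*" if abs(t) > T_CRIT_01 else ""
    print(f"{name}: r = {r:+.2f}, t = {t:.2f}{flag}")
```

With these assumptions the smallest starred coefficient, +.21, falls just beyond the critical value, while the +.13 for the engineers key falls well short of it, in agreement with the table.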


Analysis of the data for the 154 cases revealed a correlation of +.23 
between honor point ratio and curriculum satisfaction score. When those 
cases having curriculum satisfaction scores of 18 and below were compared 
with those having scores of 25 and above, the dissatisfied group had a 
mean honor point ratio of .89, the satisfied group a mean honor point ratio 
of 1.33. The “t” test was significant beyond the one per cent level of
probability. The two groups were not significantly different on the basis 
of high school percentile rank and scores on the engineers key. 

Both the correlation of +.23 and the “t” of 2.67 demonstrate that
those students who achieve most successfully tend to express most satis- 
faction with their curriculum, but that college grades do not play a great 
part in determining curriculum satisfaction. 

The five most satisfied and the five least satisfied cases were then com- 
pared with each other. The mean curriculum satisfaction score for the 
satisfied group was 26.80, for the dissatisfied group 12.00. The mean 
honor point ratio of the satisfied group was 1.60, of the dissatisfied group, 
.788. The mean raw score on the engineer’s scale of the interest blank 
for the satisfied group was 51.80, for the dissatisfied group, 35.20. Using
the “t” test, these differences all approached significance but none of them
(except the difference of the means of the curriculum satisfaction scores) 
reached the .05 level of probability. 

The 43 people for whom completely scored Strong blanks were avail- 
able were divided into four groups: those with a primary interest pattern 
in the engineering fields, those with a secondary interest pattern here, 
those with a tertiary interest pattern and those with no interest pattern in 
engineering. 

Analysis of variance revealed that no significant differences existed 
between these groups on the basis of honor point ratio or high school 
percentile rank. The groups differed significantly however (at the 5 per 
cent level of probability) on the basis of the curriculum satisfaction score, 
that group having no pattern being significantly less satisfied than the 
groups having interest patterns in engineering. There were, however, 
only four cases in the “no pattern” group and these results should be
determined on a larger group. 

One hypothesis tested in this study and not supported by the data was
that a conflict of measured interests, as shown by the simultaneous presence
of two opposing interest patterns for an individual, would be a
rupting factor and influence the student’s academic success and curriculum 
satisfaction. 

Those cases for whom completely scored Strong blanks were available 
were divided into two groups, one having only scientific interests, the 
other with scientific interests and other interests. The people in the first 
group had either a primary or secondary interest pattern in one of the 
technical or scientific job families on the Strong profile and no other pri- 
mary or secondary pattern. The second group had both a scientific 
pattern and a non-scientific primary or secondary pattern. The two 
groups were found to be no different on the basis of honor point ratio, high 
school percentile rank or curriculum satisfaction score. The presence of 
two conflicting measured interest patterns does not appear to interfere 
with achievement or influence curriculum satisfaction. 

The correlation between masculinity-femininity and honor point ratio 
and curriculum satisfaction, and between occupational level scores and 
these two variables was also determined for these 43 cases. These corre- 
lations were: 


Masculinity-femininity and honor point ratio            .14
Masculinity-femininity and curriculum satisfaction      .07
Occupational level and honor point ratio                .03
Occupational level and curriculum satisfaction          .01


None of these correlations is significantly different from zero.
Masculinity-femininity and occupational level cannot be used to predict either
collegiate achievement in engineering or curriculum satisfaction.


Summary 


Effective counseling at the college level involves the prediction of both 
achievement and satisfaction. Tests of ability and information are the 
most successful predictors of academic achievement. This study was 
conducted with the purpose of determining if tests of vocational interests 
could be used to predict a student’s satisfaction with his curriculum. 

A curriculum satisfaction blank was filled out by 154 engineering 
students, who a year before at entrance to college had taken a battery of 
ability, achievement and interest tests. The relationships between cur- 
riculum satisfaction, college grades, and test scores were analyzed. 

The results indicate that no single factor bears a high relationship to a 
student’s satisfaction with his curriculum. Satisfaction is significantly 
related to academic achievement but the correlation between these two 
variables was only .23. There is evidence that students with no primary 
interest pattern in engineering will be least satisfied with their school
courses and that those who are extremely satisfied or dissatisfied might be
differentiated upon the basis of the engineers key of the interest blank. 

The results of this study do not demonstrate that interests will or will 
not predict curriculum satisfaction. They do suggest that this might be a 
profitable field of study and that a more complete measure of satisfaction, 
a more heterogeneous group of people, and a longer time interval might 
provide more conclusive results.



Problems of Upperclass Students in a Teachers College


Francis W. Hibler 
Stevenson, Jordan, and Harrison, Chicago 
and 
Arthur Hoff Larsen 
Illinois State Normal University 


Problems of young people of college age are recognized as being im- 
portant. The added pressure placed upon such students by the present 
crisis has emphasized the importance of their problems to an even greater 
degree than heretofore. The problems in themselves are, of course, im- 
portant, but more so is the matter of helping students to solve their 
problems once they have been defined. This report is based on an attempt 
to show one procedure that may be used. 


Procedure 


The college form of Mooney’s Problem check list¹ was administered to
204 upperclassmen at Illinois State Normal University in February, 1943. 
This group was composed of 110 juniors and 94 seniors. The check list 
was presented to students in certain of their education classes in order to 
obtain their reactions in as nearly a normal situation as possible. They 
were told that this study of their problems had been suggested by an 
investigation which had been made of the problems of college freshmen 
and through a real desire on the part of the faculty to be of service through 
better understanding of their difficulties. 

The check list consists of 330 items and five questions. The students 
are instructed to “read the list slowly, pause at each item, and if it
suggests something that is troubling you, underline it.” After they have 
completed this step they are further instructed to “look back over the 
items you have underlined and circle the numbers in front of the items 
which are of most concern to you.” They are then asked to answer the 
five summarizing questions. 

In the discussion which follows we shall use the terms underlined and 
circled items in accordance with the directions referred to above.
Underlined items will also be referred to as total problems and circled items as 
serious problems. In addition we shall refer to categories of problems 
since the check list divides the items into eleven categories of thirty items 
each. The list of categories is presented in Table 4. 


1 Ross L. Mooney. Problem check list. Columbus, Ohio: Bureau of Educational 
Research, Ohio State University, 1941. 


Results 


The mean number of items underlined by the 204 students was 18.2 
per person and the mean number circled was 4.4. While these figures 
give a picture of the problems faced by students, a better picture may 
probably be secured by a study of the items most frequently underlined 
and those most frequently circled. 

Table 1 shows the frequency and rank of the seven most frequently 
underlined items as well as a standard score for each item. This latter 
statistic is used in later comparisons. In this and the following tables 
the number of items to be included was determined arbitrarily by
inspection of the data. 


Table 1 
Items Most Frequently Underlined 

Statement of Problem                          Frequency   Rank   Z-Score 

Too little chance to read what I like            74         1      5.49 
Not enough sleep                                 56         2      3.90 
Disliking financial dependence on family         54         3      3.72 
Not enough time for recreation                   50         5      3.37 
Wondering if I’ll be successful in life          50         5      3.37 
Afraid to speak up in class discussions          50         5      3.37 
Taking things too seriously                      48         7      3.20 


A second matter of import is the frequency with which certain prob- 
lems are indicated as serious ones. The seven most frequently circled 
items appear in Table 2. 

Table 2 
Items Most Frequently Circled 

Statement of Problem                          Frequency   Rank   Z-Score 

Lacking self-confidence                          22         1      3.48 
Disliking financial dependence on family         19         2      2.91 
Taking things too seriously                      17         3.5    2.52 
Afraid of speaking up in class discussions       17         3.5    2.52 
Sickness in the family                           14         5      1.95 
Wondering if I’ll be successful in life          13         6.5    1.76 
Putting off marriage                             13         6.5    1.76 


The Z-scores indicated in each of the foregoing tables provide a means 
for comparing total and serious problems. Table 3 shows the seven 
problems which may be considered most serious on the bases of both
underlining and circling. It was prepared by adding the standard scores 
and ranking the items accordingly. It will be observed that while an 
item may not appear in one or the other of the first two tables its standard 
score in the case where it does appear is high enough to include it in 
Table 3. 
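
The combination just described, adding an item's two standard scores and ranking on the sum, may be illustrated with a minimal sketch. It uses only the four items that appear in both Table 1 and Table 2, with the Z-scores printed there; the dictionary and function names are ours, and the item wording follows Table 1.

```python
# Z-scores copied from Table 1 (underlined) and Table 2 (circled) for the
# four items that appear in both tables. Names here are illustrative only.
underlined_z = {
    "Disliking financial dependence on family": 3.72,
    "Wondering if I'll be successful in life": 3.37,
    "Afraid to speak up in class discussions": 3.37,
    "Taking things too seriously": 3.20,
}
circled_z = {
    "Disliking financial dependence on family": 2.91,
    "Wondering if I'll be successful in life": 1.76,
    "Afraid to speak up in class discussions": 2.52,
    "Taking things too seriously": 2.52,
}


def combined_ranking(under, circ):
    """Add the two standard scores per item and sort from highest to lowest."""
    combined = {
        item: round(under[item] + circ[item], 2)
        for item in under
        if item in circ
    }
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)


for item, z in combined_ranking(underlined_z, circled_z):
    print(f"{z:5.2f}  {item}")
```

The four sums (6.63, 5.89, 5.72, and 5.13) agree with the combined scores shown for these items in Table 3. The remaining three entries of Table 3 cannot be checked in this way, since each of those items appears in only one of the first two tables.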

As indicated earlier the 330 items may be grouped into the eleven 
categories shown in Table 4. This table shows the rank order of
categories in terms of frequency of underlined and circled items. The rank 
correlation coefficient is .27 with a standard error of .23. Since the 
coefficient is only slightly larger than its standard error, the usual 
statistical procedure would indicate that it is not significant. 

By using the chi-square statistic we may secure a more definite test of 
homogeneity between the two types of problems within the categories. 








Table 3 
Most Serious Problems on Basis of Frequency of Underlining and Circling, 
Weighted by Standard Scores 

Statement of Problem                          Combined Z-Score   Rank 

Disliking financial dependence on family           6.63            1 
Lacking self-confidence                            6.24            2 
Too little chance to read what I like              6.11            3 
Afraid of speaking up in class                     5.89            4 
Taking things too seriously                        5.72            5 
Not enough sleep                                   5.28            6 
Wondering if I’ll be successful in life            5.13            7 





Chi-square is 48.03 which, with ten degrees of freedom as we have here, 
means that we may be practically certain that there are real differences 
between underlined and circled problems from category to category. 
This statement may be made since a chi-square of the size here obtained, 
with the indicated number of degrees of freedom, goes far beyond the one 
per cent level. Stated in other words it means that in far less than one 
per cent of all random samples would we expect to find a chi-square as 
large as the one found here, if there were no real differences between under- 
lining and circling. 
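
The statement that a chi-square of 48.03 with ten degrees of freedom lies far beyond the one per cent level can be checked with a short sketch. For an even number of degrees of freedom the upper-tail probability has a closed form, so no statistical library is needed; the function name is ours, and the value 23.209 used in the check is the familiar tabled one per cent point for ten degrees of freedom.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability P(chi-square >= x) for an even number of
    degrees of freedom, from the closed form
        exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)**k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)**k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Chi-square reported in the text, with ten degrees of freedom.
p_observed = chi2_upper_tail_even_df(48.03, 10)
print(f"P(chi-square >= 48.03 | df = 10) = {p_observed:.2e}")

# The tabled one per cent point for df = 10 should give a tail of about .01.
print(f"P(chi-square >= 23.209 | df = 10) = {chi2_upper_tail_even_df(23.209, 10):.4f}")
```

The closed form holds only for even degrees of freedom; for odd values a library routine or a printed table would be required.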

What is probably more important are the categories which make the 
greatest contribution to this large chi-square and may therefore be
considered to have large differences in the extent to which they represent 
total problems and serious problems. These categories, whose
chi-squares would themselves be significant beyond the one per cent level, are 
as follows: social and recreational activities; curriculum and teaching 
procedures; courtship, sex, and marriage; and finances, living conditions, 
and employment. It should be pointed out that in the first and second of 
these categories the tendency was to have fewer serious problems than 
might be expected assuming that the distribution of underlined items was 
the same as that of circled items. 


Table 4 
Rank Order of Problem Categories 

                                              Rank Order   Rank Order 
                                              of Total     of Serious 
Problem Category                              Problems     Problems 

Health and Physical Development 
Finances, Living Conditions, and Employment 
Social and Recreational Activity 
Social-Psychological Relations 
Personal-Psychological Relations 
Courtship, Sex, and Marriage 
Home and Family 
Morals and Religion 
Adjustment to College Work 
The Future,—Vocational and Educational 
Curriculum and Teaching Procedure 



The final part of the test consisted of five questions which the students 
were asked to answer. Their responses are tabulated in Table 5. In 
connection with question 5 it may be pointed out that 40, or 16 per cent 
of the 204 students, listed the name of the person with whom they would 
like to talk. 


Table 5 
Answers to Questions About Check List 

                                            Number            Per Cent 
                                            Answering         Answering 
Question                                    Yes     No        Yes     No 

Do you feel that the items on the list      183     13         90      6 
  give a well-rounded picture of 
  your problems? 
Have you enjoyed filling out the list?      180     14         88 
Whether you have or have not enjoyed                12         89 
  filling out the list do you 
  think it has been worthwhile doing? 
If the opportunity were offered, 
  would you like to talk over any of 
  these problems with someone on 
  the college staff? 
If so, do you know the particular 
  person(s) with whom you would 
  like to have these talks? 


Discussion of Results 


In Table 1, dealing with the underlined items, it is interesting to note 
that three of the first four items indicate that the students do not have 
enough time to read, sleep, and play. The other item indicates a dislike 
for financial dependence on the family which is also, at least indirectly, 
related to the time factor. The prevalence of this type of item as being 
the most frequent facing the juniors and seniors would seem to indicate 
that the question of organization of one’s activities is the most wide- 
spread problem facing the average upper-class college student in this 
study. In considering what can be done, both the academic program and 
the personal habits of the student must be studied. Certain it is that 
this problem is wide-spread enough in this particular population that
adequate personnel procedures demand a careful study of the use of the 
students’ time. 

Turning to the circled problems, or those of much concern, it is inter- 
esting to note that “lack of self-confidence” is the item indicated with the 
greatest frequency. Among the first four items three of them indicate 
feelings of inferiority or worry in personal relationships, while the fourth 
one again has to do with disliking financial dependence on the family. 

Hence, the picture presented by a study of the circled items is that 
students are most concerned about their problems in human relationships, 
particularly those of personal status in their college environment. The 
patterns presented by the two types of responses, therefore, are quite 
different. The underlined items seem to deal mostly with problems 
concerning the distribution of time in the accomplishment of assigned 
tasks and recreational activity, while the serious problems center around 
the gaining of skill in personal relationships. It would appear from these 
results that juniors and seniors in this sampling do not feel that they have 
achieved adequate personal adjustment since approximately 11 per cent 
of them indicate that lack of self-confidence is still a major problem. 

Turning to Table 3, where the underlined and circled problems are 
combined, the results indicate an intermingling of these problems, as would 
be expected. In the opinion of the authors, although such a treatment of 
the data gives an over-all view of the problems most frequently checked, 
it does not offer suggestions for the use of the data as constructive as 
those obtained by considering each type of marking separately. 

Turning to Table 4, which gives the rank of the various categories, we 
find social and recreational activity has the greatest frequency of
underlined problems. This ties up with our former statement as to the most 
wide-spread problems. The category of “personal-psychological rela- 
tions” comes first among the serious problems, while problems concerning 
the “future,—vocational and educational” rank second. When con- 
sidering the serious problems this group of upperclassmen ranked the 
category “adjustment to college work” last, and the category “curriculum 
and teaching procedures” just ahead of it. It would seem from these 
data, therefore, that members of this group were much more concerned 
with their own problems in human relationships than with the academic 
problems related to their college life. 

Table 5 lists the results of the answers to questions about the check 
list. It is interesting to note that 90 per cent of this group stated defi- 
nitely that they believed the items on the list gave a well-rounded picture 
of their problems, and that 88 per cent of them said that they enjoyed 
filling out the list. Inasmuch as the matter of compulsion was carefully 
controlled when the lists were given, this would seem to indicate that 
students are anxious and willing to cooperate in any personnel procedure 
which they believe is designed to help them. The fact that 89 per cent of 
them stated they believed the list was worthwhile, regardless of whether 
or not they enjoyed filling it out, would indicate that most of the students 
felt that making out the list had a certain cathartic value. Approxi- 
mately half thought they would like to discuss their problems with some- 
one on the college staff, but only 29 per cent knew the particular individ- 
uals to whom they would like to talk. This would indicate that even in 
their junior and senior year many students feel the need of discussing 
their problems with a faculty counselor and that almost half of this group 
do not yet know the particular person with whom they would like to talk. 
It is probably reasonable to assume that this latter group, therefore, will 
not take steps to have conferences, unless encouraged to do so by special 
techniques or unless their problems become very acute. 

One of the most informative items in the actual use of the problem 
check list in a counseling situation was the second question which we 
were not able to summarize effectively. It read: “How would you
summarize your chief problems in your own words? Write a brief summary.” 
As all of the above questions were answered after they had made out the 
check list, no doubt these statements were more specific and objective 
than one would expect if they had not had this experience. Of the 
several faculty members who studied the answers to the above questions, 
all were impressed with the objectivity and clarity that was almost
universally present in the statements of this group of upperclassmen. In 
general, the data included in their own statements of their problems were 
of considerable value when these students were later interviewed in the 
mental health service. 


Interpretation of Results 


Perhaps the first consideration in the interpretation of the above data 
is the recognition of the difference between a stated problem and an actual 
conflict in the life of any individual. Obviously, these data are stated 
problems which in themselves are frequently symptoms of some underly- 
ing difficulty rather than an accurate picture of the underlying problems 
of upper-class college students. This point is given weight by a
comparison with a previous study 2 made of the freshmen in the same
university. The freshmen indicated that their three most serious problems 
were: “Don’t know how to study; getting low grades; and concerned 
about military service”; while the above data indicate that the most 
serious problems for upperclassmen are: “lacking self-confidence, dis- 
liking financial dependence upon the family,” while there was a tie for 
third between “taking things too seriously” and “not speaking up in class 
discussions.” The picture of the freshmen was that of a more generalized 
feeling of difficulty in making adequate adjustment to the college environ- 
ment; while the data on the juniors and seniors show a more specific 
identification of the problems to be solved. These results are borne out by 
clinical observations of college students in a counseling situation. 

Another problem that must be of concern in the interpretation and use 
of these data is that of determining the relative merit of the underlined 
and circled items in giving adequate pictures of individual students. 
This problem is not easily decided and the opinions of the authors are 
derived by using consultation experience as the criterion in our attempted 
validation of this technique. Such a validation is not statistical in the 
usual sense. The use of any statistic which would group the data would 
only serve to confuse the real issue. Each student must be considered as 
a study in himself, and the data on his problem check list must be checked 
against a study of his problems by other clinical methods. One of the 
authors, by virtue of his position, has been able to do this in a large num- 
ber of the cases involved in this study. Most of the students who made 
out the blanks have been members of his classes in mental hygiene in 
which they have been required to write a careful analytical autobiography 
of themselves. These autobiographies have been discussed with many 
of the students at a later date. As director of the student mental health 
service, he has also had many opportunities to study the most 
serious deviates in terms of their stated problems and actual conflicts. 


2 Houston, V. M., and Marzolf, S. S. The use of the problem check list in college 
personnel work. To appear in the June 1944 issue of the Journal of Higher Education. 
In the light of the above experiences it is our opinion that the under- 
lined and circled items are quite different in their significance in the 
clinical picture. Our findings may be summarized as follows: 

1. The underlined items in general tend to be irritations more than 
major problems. In fact, study of these items has not been particularly 
helpful to the mental health service even as a screening technique, which 
should probably be their major function. 

2. The circled items, or those of most concern to the student, have 
proved the more diagnostic of the two, both in the screening process and 
as indicating the areas in which major conflicts lie. 

3. The combination of items by use of the z-score technique does not 
appear to offer any advantages over the consideration of the serious 
problems by themselves. 

4. No particular pattern has yet been discovered by the authors 
which is “typical” of cases that may prove to be serious when they come 
to the mental health service. 


Received June 5, 1948. 
Studies in International Morse Code 


III. The Efficiency of the Code as Related to Errors 
Made During Learning 
F. S. Keller and W. N. Schoenfeld 
Columbia University 


The first Morse Code, proposed by Morse himself in 1832 (6, Appen- 
dix), consisted of ten signals, representative of the ten digits and por- 
trayed visually as dots and spaces between dots. Thus, 1, 2, 3, 4, and 5 
were indicated by like numbers of dots presented in rapid succession with 
each signal followed by a space which was perceptibly greater than the 
intra-signal space. The digits 6, 7, 8, 9, and 0 were also indicated by one, 
two, three, four, and five dots respectively, but the pause after each of 
these signals was two-thirds again as large as that which followed each 
of the first five digits (see Table 1-A). This code was truly
cryptographic: the letters and words of the English language were to be
represented by digits and digit-combinations; and Morse spent long hours in 
the construction of a “telegraphic dictionary” for use in encoding and 
decoding messages. 

During this period, Morse was thinking in terms of the simplest form 
of automatic transmission of signals. By 1837, after he had fully realized 
the possibilities of human manipulation in sending the signals, he had 
invented a “telegraphic alphabet” (Table 1-B), in which dashes or lines 
supplemented the dots and spaces of the original code, and a new intra- 
signal space was occasionally employed. The habit of economizing per- 
sisted, however, and three of the new signals were made to do service for 
six of the letters. 

By 1844, when the now-famous message, “What hath God wrought!”, 
was transmitted from Washington to Baltimore and back again, the 
American Morse Code was perfected (Table 1-C).1 Even in the later 
forms, however, Morse’s thrift was not to be denied. He chose his sig- 
nals only after a visit to a printing office where he determined the fre- 
quency of letter usage in English by reference to the type supply in the 
compositor’s case (5, see Plate facing p. 68). A glance at Table 1-D will 
show how he attempted to relate the number of signal components to the 
quantity of type for each letter found in the type-setter’s case. Whether 
he made type counts for the digits is uncertain. It will be seen that, in 
general, there is an inverse relation between the frequency of the letters 
and the number of dots and dashes in the signals assigned to them. 


1 The earliest publicly transmitted message, at New York University, in 1838, was 
“Attention the Universe, by kingdoms right wheel!” The code used in this message was 
almost identical with the one shown in Table 1-C. 


Table 1 
Important Steps in the Development of the International Morse Code 

Column A: Morse’s Code, 1832 
Column B: Morse’s Code, 1837 or Earlier 
Column C: Morse’s Code, 1844 or Earlier (“American Morse”) 
Column D: Quantity of Printer’s Type 
Column E: Present International Morse Code 

Note: An asterisk in columns B and C denotes a space equal to two units (dots); in 
column A, a space equal to one unit (1). 

Morse’s final offering became the accepted code in the United States 
and Canada. Even today, in spite of the development of radio and 
telephonic communication, it is used to some extent in American railway 
and commercial telegraphy. In Europe, however, there grew up a
modification, German in origin, of this “American Morse,” a revision which 
changed almost entirely the digit signals and replaced those alphabet 
signals for which Morse had used an additional intra-signal space (viz., the 
signals for C, O, R, Y, and Z) (9). This “International Morse,” sometimes 
called “Continental” code, is now almost universally employed in
radio-telegraphy (8). 

It does not appear that Morse, in constructing his code, was concerned 
with the discriminability of the signals for the person whose task it was to 
receive them. He can hardly be blamed for failing to appreciate the fact 
that high-speed transmission of signals by skilled operators might one day 
introduce sources of perceptual error which could have been avoided by a 
happier choice of signals. The wonder is that his selection was so good. 
Complaints by telegraphers throughout the years were relatively few, and 
were concerned mainly with those signals containing extra internal spaces, 
with which the International code is not burdened. 

In connection with this question of psychological efficiency, it has been 
of some interest to relate the frequency of letter usage, in both English 
and German, to the difficulty experienced by beginners in learning to 
recognize the signals of International Morse. No adequate studies of 
learning to recognize American Morse are at present available.
“Difficulty” here refers to difficulty in learning and does not apply to the
difficulties experienced by skilled operators. 

Two recent investigations (3, 7), utilizing essentially the same training 
method (2), have been made for the purpose of determining the relative 
difficulty of these signals during learning. One of these studies, by Keller 
and Taubman, used hand-sending and involved mastery of both alphabet 
and digit signals. The other, by Spragg, used machine-sending and 
involved mastery of alphabet signals only. In each case, the “whole 
method” was employed, i.e., all characters were represented in each 
practice run, and difficulty was measured in terms of errors of omission 
and substitution. The rank orders of difficulty presented in Table 2 
were arranged in each case on the basis of substitution errors alone,
highest rank being given to those signals evoking most errors. The
Keller-Taubman rank order, in column I of the table, was constructed by
omitting from consideration all errors occurring through confusion of alphabet 
with digit signals. In spite of certain minor differences in the training 
technique employed, and the small number of subjects used to determine 
the Spragg rank order (column II), the two orders are highly correlated.2 


2 This correlation may be taken to indicate the specific nature of the stimulus 
generalizations involved in this kind of learning. The Keller-Taubman rank order was 
obtained by subtracting the digit-letter generalizations of each signal from its
substitution-error total. The result was practically identical with the order obtained from the 
Spragg experiment in which the digits were not taught. The failure of the digits to 
provide a “contextual” influence is the important point here. 


Table 2 
The Rank Orders of Difficulty during Learning, and the Frequency of Usage in 
English and German, of the Letters of the Alphabet. The subscripts to the rho’s refer 
to the Roman numerals at the heads of the columns. The data on usage frequency are 
from L. D. Smith, Cryptography, Norton & Co.: New York, 1943. 

                  I                  II               III             IV 
            Rank Order of       Rank Order       Rank Order      Rank Order 
            Difficulty in the   of Difficulty    of Usage        of Usage 
            Keller-Taubman      in the           Frequency       Frequency 
Character   Study               Spragg Study     in English      in German 

    A            21                  18               4               9 
    B            17                  16              20              16 
    C             9                   9              12              14 
    D            10                  15              10               8 
    E            26                  26               1               1 
    F            12                   8              16              18 
    G             5                  11              18              11 
    H            16                  17               9              10 
    I            23                  20               6               4 
    J            11                   7              25              23 
    K            13                  13              22              20 
    L             6                  10              11              12 
    M            20                  22              15              15 
    N            18                  21               5               2 
    O            22                  25               3              13 
    P             2                   2              19              22 
    Q             4                   5              24              24 
    R            14                  14               7               3 
    S            19                  23               8               5 
    T            24                  24               2               6 
    U                                 6              14               7 
    V                                19              21              21 
    W                                 4              13              19 
    X                                 3              23              25 
    Y                                 1              17              26 
    Z             1                  12              26              17 

rho_I.III = -.51; P.E. = .10          rho_II.III = -.66; P.E. = .08 
rho_I.IV = -.53; P.E. = .10           rho_II.IV = -.81; P.E. = .05 


Table 3 


The relation between these rank orders and those obtaining for letter 
usage in English and German may be observed in the correlation
coefficients of Table 2. In each case, an inverse, although imperfect, relation 
exists between letter usage in both languages and the difficulty ratings of 
both error studies. It may be noted, however, that Spragg’s rank order 
correlates higher with both the German and the English letter-usage
frequency than does the Keller-Taubman rank order; and that, in both cases, 
the correlation with the German is higher than the correlation with the 
English. If we measure efficiency in terms of the correspondence between 
letter usage and learning difficulty, it would seem that the German
language has a slight advantage in the matter! 
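
The rank-difference coefficients of Table 2 can be recomputed from the rank columns themselves. The sketch below does so for the Spragg difficulty order (column II) against English usage (column III). It rests on an assumption of ours, namely that the rows of the table run alphabetically from A to Z; this is not stated in the table as it survives. The list and function names are likewise ours.

```python
# Ranks read from Table 2, assuming (our assumption) that the rows run
# alphabetically from A to Z.
spragg_difficulty = [18, 16, 9, 15, 26, 8, 11, 17, 20, 7, 13, 10, 22,
                     21, 25, 2, 5, 14, 23, 24, 6, 19, 4, 3, 1, 12]
english_usage = [4, 20, 12, 10, 1, 16, 18, 9, 6, 25, 22, 11, 15,
                 5, 3, 19, 24, 7, 8, 2, 14, 21, 13, 23, 17, 26]


def spearman_rho(ranks_a, ranks_b):
    """Rank-difference correlation for untied ranks:
    rho = 1 - 6 * sum(d**2) / (n * (n**2 - 1))."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(spragg_difficulty, english_usage)
print(f"rho (Spragg difficulty vs. English usage) = {rho:.2f}")
```

Under this reading the coefficient comes to about -.66, which agrees with one of the values printed beneath Table 2. The Keller-Taubman column cannot be treated in the same way here, since several of its entries are missing.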

In the interests of greater psychological efficiency, for both the novice 
and the expert, it is of importance to consider the types of error commonly 
encountered in receiving code. Such information is prerequisite to any 
thoroughgoing analysis of the variables involved and any practical
considerations. To this end, use has been made of error data presented in 
the second paper of this series (3) and reproduced in Table 3. In this 
table, the code signals are ranked in order of total difficulty, measured in 
terms of errors of substitution and omission, although substitutions alone 
are entered in the body of the table, and then only when they are greater 
than five in number. 


Fig. 1. The plot shows how difficulty varies for the character groups, that is, how 
the four groups of characters are discriminated by the total error score. The straight 
line was drawn in by inspection. The table gives the substitution and omission error 
scores making up each group’s total. Separate plots of these errors will yield two curved 
graphs departing slightly, and in opposite directions, from the straight line shown. 

            Total     Substitution   Omission 
            Errors    Errors         Errors 
Group 1     6401      2668           3733 
Group 2     4836      2129           2707 
Group 3     3464      1700           1764 
Group 4     1913       589           1324 


Without disturbing this rank order, these thirty-six signals were 
divided into four groups of nine each. Group I, the most difficult,
contained the signals at the head of the rank order, namely, P, W, J, F, Y, 
G, Q, L, and Z. Group II comprised 3, U, B, 2, C, D, X, 6, and K. 
Group III comprised 1, R, 7, 4, 8, H, 9, 5, and V. Group IV, the easiest 
to learn, included S, N, O, M, A, I, T, 0 (zero), and E. Although this 
ranking is based upon omissions plus substitutions, and the following 
analysis is concerned with substitutions alone, it seemed advisable to use 
this rank order since it probably affords a more stable measure of difficulty, 
being derived from a greater number of stimulus presentations. 

The validity of this breakdown is attested by Figure 1, in which total 
errors are plotted for the four groups mentioned, and in which difficulty is 
seen to vary in an orderly fashion. The table accompanying this figure 
contains the data on which the plot is based. The entries in the first 
column were obtained by summation of total substitution and omission 
errors for the nine signals in each of the groups. This may be checked by 
reference to the earlier paper (3, Table 1). In the second and third 
columns, the total error score for each group has been broken down into 
its substitution- and omission-components. A plot of either of these will 
show adequate differentiation between the groups. 

In the opinion of the authors, the problem of Morse code reception is 
essentially a problem of discrimination; and the errors made in such learn- 
ing are considered primarily as examples of stimulus generalization, in 
which a given response may be aroused by two or more stimuli sharing in 
one or more properties. From this point of view, Figure 1 represents the 
relation between generalization, as expressed in total error score, and the 
four character groupings. 

In line with the foregoing, Figure 2 is presented as evidence that the 
“difficulty” of the four signal groups is also related to the number of cases 
of generalization. The table accompanying this figure was constructed 
in the following manner. In Group I, the signal for P was found to 
generalize with eighteen other signals, J with twenty-three others, and so 
on. A count was made of the number of such cases for the nine signals of 
each of the four groups, disregarding the actual frequency within each 
case. For example, it was found that the response B was made thirteen 
times to the presented signal P, whereas C occurred twenty-nine times, 
yet these are counted simply as two cases. While it is apparent that these 
cases of generalization are of unequal strength, as indicated by their 
unequal frequencies, it is interesting that such a crude measure as a simple 
count should, in effect, yield a “difficulty” curve,—a result which may be 
attributed in part to the broad signal groupings chosen. 
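
The counting rule described above, in which each stimulus-response pair is counted once regardless of how often it occurred, may be sketched as follows. Only the two P entries (response B thirteen times, response C twenty-nine times) are taken from the text; the data layout and the function name are ours.

```python
def count_generalization_cases(substitutions):
    """Count stimulus-response pairs, ignoring how often each occurred.

    `substitutions` maps each presented signal to a dict of
    {response given: number of times that response was made}.
    """
    return sum(
        1
        for responses in substitutions.values()
        for frequency in responses.values()
        if frequency > 0
    )


# The only pairs quoted in the text: B was given 13 times and C 29 times
# in response to the presented signal P.
example = {"P": {"B": 13, "C": 29}}
print(count_generalization_cases(example))  # 2 cases, despite unequal strength
```

A weighted alternative would sum the frequencies themselves (here 42), which is in effect the substitution-error score of Figure 1; the simple count deliberately discards that information.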

In accordance with the emphasis upon stimulus generalization, the 
designation of signal properties would seem to be of fundamental im- 
portance in accounting for error. It is, however, to be noted that varia- 
bles other than those of the external stimulus situation may play an 
important role. The specific response to a specific signal may be influ- 
enced by factors introduced by the organism as a result of responses made 
to prior stimulation, and this influence may obscure the immediate rela- 
tion in which the psychologist is most interested. For this reason, a 
consideration of stimulus characteristics which distinguish difficulty 
groups will be postponed until an attempt has been made to classify 
certain types of error in a manner inclusive of both stimulus and response. 
Fig. 2. The plot shows the number of cases of inter-signal generalization within each of 
the four groups. The straight line was drawn in by inspection. 

            Number of Cases of 
            Inter-Signal 
            Generalization 
Group 1          153 
Group 2          120 
Group 3           83 
Group 4           44 


Figure 3 pictures the trend of what may be called “reversal” and 
“inversion” errors for the four groups of characters. A reversal error is 
one in which a signal is mistaken for its “mirror image”; for example, 
F (..-.) is mistaken for L (.-..). An inversion error is one in which the 
signal is mistaken for one that possesses the same number of components, 
but in which dot replaces dash and dash replaces dot; for example, 
K (-.-) is mistaken for R (.-.). The table entries were arrived at in the 
following fashion. Taking the letter P, of Group I, it may be seen that a 
reversal error is impossible, since the mirror image of (.--.) is (.--.); 
and that the inversion error for P, which is X (-..-), occurred twenty 
times (see Table 3). For W (.--), the reversal error, G (--.), occurred 
forty-six times; and the inversion error, D (-..), twenty-seven times. In 
this manner, the two types of error were separately summed for the nine 
signals of each group, and the group totals were plotted. Figure 3 shows 
that the reversal errors bear some relation to the generalization plots of 
Figures 1 and 2, in that they decrease regularly in frequency (although 
not in rectilinear fashion) as “difficulty” decreases. The inversion errors 
do not show this, but are included because of their high frequency and 
their possible dependence upon an interesting stimulus property to be 
considered later. 

Fig. 3. The graphs (fitted by inspection) for reversal errors (circles) and 
inversion errors (triangles) in the four character groups. If the sum of the 
reversal and inversion errors for each group were plotted, the result would 
resemble the reversal error graph. 

             Reversal    Inversion
             Errors      Errors
Group 1      256         177
Group 2      109         198
Group 3      73          198
Group 4      41.5        217.5

Note: The fractional values for Group 4 result from splitting the N-A and 
A-N errors equally between reversals and inversions. 
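
The two definitions above can be restated as operations on dot-dash 
strings. The following is only an illustrative sketch: the code table is 
standard International Morse, written with "." for a dot and "-" for a 
dash, while the helper names (reversal, inversion, SIGNAL_TO_CHAR) are 
hypothetical and do not come from the original analysis. 

```python
# Standard International Morse code for the 36 characters discussed here.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}
# Reverse lookup: dot-dash pattern -> character (hypothetical helper).
SIGNAL_TO_CHAR = {signal: char for char, signal in MORSE.items()}


def reversal(char):
    """Character whose signal is the mirror image of char's signal, or None."""
    mirrored = MORSE[char][::-1]
    other = SIGNAL_TO_CHAR.get(mirrored)
    # A symmetric signal (such as P) mirrors onto itself: no reversal error.
    return other if other != char else None


def inversion(char):
    """Character obtained by exchanging every dot and dash, or None."""
    swapped = MORSE[char].translate(str.maketrans(".-", "-."))
    return SIGNAL_TO_CHAR.get(swapped)


for ch in "FKPW":
    print(ch, MORSE[ch], "reversal:", reversal(ch), "inversion:", inversion(ch))
```

For A (.-) both operations lead to N (-.), which is consistent with the 
note to the Figure 3 table about splitting the N-A and A-N errors between 
the two categories. 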

Figure 4 is a plot of “continuation” errors. These errors, the origin 
of which is still somewhat puzzling, may best be explained by illustration. 
If the signal for the letter P were immediately repeated without the usual 
inter-signal space, the result would appear visually as (.--..--.). If 
this dot-dash series is examined, it will be seen that two other 
four-component signals may be abstracted from the series, namely X (-..-) 
and Z (--..); and, by reference to Table 3, it will appear that X was 
given as the response to P twenty times, and Z was given fourteen times. 
Since, however, X was treated earlier as the inversion error for P, only the 
Z response is considered as a continuation error. To take a second 
example: the continuation of W gives (.--.--), from which, again limiting 
the errors to the same number of components as in the presented signal, one 
may abstract G (--.) and K (-.-). In this case, since G is already 
subsumed under reversal errors, only the K is used as a pure continuation 
error, occurring with a frequency of twenty-nine. In this way, 
continuations were tabulated for the table accompanying Figure 4. The plot 
itself illustrates the relation between these errors and group difficulty. 

Fig. 4. The plot (fitted by inspection) for continuation errors. 

             Continuation
             Errors
Group 1      160
Group 2      73
Group 3      35
Group 4      0

There is one extremely important source of error which, although 
unrelated to group difficulty, exerts a powerful influence in the case of 
certain signal confusions. This is the so-called “dotting” error, which 
consists in an underestimation of the number of dot-elements in a signal, 
and which is especially noticeable in the generalization of 5 (.....) with 
H (....) and H with S (...), as well as 6 (-....) with B (-...) and 
4 (....-) with V (...-). These four cases, out of a total of four hundred, 
account for approximately one-fifteenth of all the errors represented in 
Table 3. If to them be added the confusions of V with U (..-), B with 
D (-..), S with I (..), etc., as well as all those which involve the 
neglect of one or more dots regardless of position within the signal, more 
than eleven per cent of the total substitution errors can be accounted for, 
and these do not include any errors classified under other headings. 
Moreover, if one were to add dotting errors of overestimation (H for S, 
5 for H, etc.), this percentage would be still greater. 
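
Both of the error types just described can be expressed as simple string 
tests. What follows is a minimal sketch, assuming signals are written as 
strings of "." and "-"; the function names are hypothetical and are offered 
only to make the definitions concrete, not as the procedure actually used 
in tabulating the data. 

```python
def continuation_candidates(signal):
    """Same-length patterns found inside the signal sent twice without a gap.

    The presented signal itself is excluded, so only the "other" patterns
    that could be abstracted from the doubled series are returned.
    """
    doubled = signal + signal
    n = len(signal)
    # Offsets 0 and n reproduce the signal itself, so only 1..n-1 are examined.
    windows = {doubled[i:i + n] for i in range(1, n)}
    windows.discard(signal)
    return windows


def is_dotting_underestimate(signal, response):
    """True if response equals signal with one or more dots dropped.

    Dashes must be untouched, and the remaining components must keep
    their original order.
    """
    if len(response) >= len(signal):
        return False
    if signal.count("-") != response.count("-"):
        return False
    remaining = iter(signal)
    # Subsequence check: each response component is found, in order, in signal.
    return all(component in remaining for component in response)


print(sorted(continuation_candidates(".--.")))   # patterns inside P repeated
print(is_dotting_underestimate(".....", "...."))  # 5 heard as H
```

For P the candidates include (--..) and (-..-), the signals for Z and X 
named in the text; the third pattern, (..--), is not assigned to any of the 
thirty-six characters. 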

Errors of “partial inversion,” wherein a single dot, either in an initial, 
internal, or final position within the signal, is taken as a dash, or a single 
dash is taken as a dot, may be treated as distinct from the inversions 
already considered. Examples are: F (..-.) giving C (-.-.), L (.-..) 
giving P (.--.), P giving J (.---), G (--.) giving R (.-.), P giving F, 
and C giving Y (-.--). These errors, which have not already been 
subsumed under other headings, constitute approximately thirty-two per cent 
of the total substitutions of Table 3. (Data reported by Spragg (7) give 
greater emphasis than do our own to partial inversions of final 
components.) 
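
A partial inversion, as defined above, amounts to two equally long signals 
differing in exactly one component. The sketch below states that test; the 
function name is hypothetical, and signals are again assumed to be strings 
of "." and "-". 

```python
def is_partial_inversion(signal, response):
    """True if response differs from signal in exactly one component."""
    if len(signal) != len(response):
        return False
    differing = sum(a != b for a, b in zip(signal, response))
    return differing == 1


# The six examples given in the text, as (presented, response) pairs.
examples = [
    ("..-.", "-.-."),  # F giving C
    (".-..", ".--."),  # L giving P
    (".--.", ".---"),  # P giving J
    ("--.", ".-."),    # G giving R
    (".--.", "..-."),  # P giving F
    ("-.-.", "-.--"),  # C giving Y
]
print(all(is_partial_inversion(s, r) for s, r in examples))
```

A full inversion such as K (-.-) giving R (.-.) changes every component and 
so fails this test, which keeps the two categories distinct. 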

Errors of inversion, reversal, continuation, dotting, and partial inver- 
sion as treated above include more than sixty-four per cent of all the 
substitutions that have been tabulated. It may be that many of the 
remaining cases represent combinations of the types here described. It is 
possible, however, that others will not be understood at all until some 
account is taken of factors in the learning situation which have not been 
mentioned here, e.g., the influence of the “phonetic equivalent” (4) and 
of the signal sequence employed in training, as well as influences which are 
at present unrecognized or inadequately appreciated. 

In connection with the question raised earlier concerning the stimulus 
characteristics responsible for signal generalization, a few comments are 
now in order. 

(1) There is a relation between the number of signal components and 
the difficulty of the signal, at least to the extent that signals with fewer 
components generalize less than signals with many components. One 
qualification of this statement must be made. A one- or two-component 
signal (e.g., A or E) is never difficult, but a larger signal may be very easy 
(e.g., 0 or 9). The digits constitute a unique group because of their 
graded five-component structure. 

(2) If a signal’s absolute duration is computed on the basis of the 
conventional assignment of one time unit to a dot, three units to a dash, 
and one unit to the space between components, a relation similar to the 
above appears: short signals generalize less readily than long signals. 
The same qualification applies here as in the case of the relation of signal 
components to difficulty, and the two cases are obviously interdependent. 

(3) Neither of the foregoing stimulus properties serves to differentiate 
adequately the four character groups mentioned earlier. Professor B. F. 
Skinner* has suggested another property, that of “double-change” or 
“change-and-change-again,” which might achieve this end. Double-change 
refers to the transition, within a signal, from dot to dash to dot or 
dash to dot to dash. Group I of the signals contains five cases of this 
sort: P (.--.), F (..-.), Y (-.--), Q (--.-), and L (.-..). Group II 
contains three cases: C (-.-.), X (-..-), and K (-.-). Group III has but 
one case: R (.-.); and Group IV has none. Although but nine of the 
thirty-six signals are involved in this differentiation of groups, they 
nevertheless account for about thirty per cent of the total substitution 
errors of Table 3. 

(4) The existence of two types of signal component, dots and dashes, is 
obviously the basis of most of the errors in code learning, e.g., errors of 
inversion, reversal, continuation, and partial inversion. In a code 
composed of but one type of component, only errors of under- or 
overestimation could occur. That these components are sufficiently distinct 
to constitute two “continua” may be seen from the fact that pure dot signals 
(E, I, S, H, and 5) generalize more with each other than with pure dash 
signals (T, M, O, and 0) or with mixed-component signals. The converse 
is true of the dash signals, but to a lesser degree since the dashes may be 
counted by the code student with relative ease. With mixed-component 
signals, except in those cases of relatively long dot or dash sequence 
(6, B, 1, J, etc.), other error types predominate. 

(5) It is true, however, that the dot and dash belong to the same 
fundamental continuum of time, and, psychophysically, both absolute 
and relative judgments of duration are possible. With a mixed-com- 
ponent signal, relative judgments are inescapable; with pure dot or dash 


(-) are not infrequently generalized. This fact may be used to explain 
the inversion errors for pure dot and pure dash signals. 
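
Comments (2) and (3) above describe properties that can be computed 
directly from a signal's dot-dash pattern. The sketch below is offered 
under stated assumptions only: signals are non-empty strings of "." and 
"-", the timing units are those given in comment (2), and the function 
names are hypothetical. 

```python
from itertools import groupby


def duration_units(signal):
    """Absolute duration: dot = 1 unit, dash = 3 units, 1 unit between components."""
    marks = sum(1 if component == "." else 3 for component in signal)
    gaps = len(signal) - 1  # one unit of space between adjacent components
    return marks + gaps


def has_double_change(signal):
    """True for dot-dash-dot or dash-dot-dash transitions within the signal.

    Collapsing each run of identical components to a single symbol leaves
    three or more symbols exactly when the signal changes and changes again.
    """
    runs = [symbol for symbol, _ in groupby(signal)]
    return len(runs) >= 3


print(duration_units(".--."), has_double_change(".--."))  # P
print(duration_units("-----"), has_double_change("-----"))  # 0 (zero)
```

Applied to the standard thirty-six signals, the double-change test picks 
out the nine characters listed in comment (3): P, F, Y, Q, L, C, X, K, 
and R. 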

The foregoing analysis, however subject to revision, lends further 
support to certain practical suggestions made in the second paper of this 
series (3). For one thing, there is reason to believe, even more than 
formerly, that some types of error may be highly prognostic of an 
individual’s ability to master code and advance to high-speed reception, at 
least within the period of time ordinarily prescribed in military training 
centers. The writers feel that this is particularly true of the dotting error 
which, despite all remedial effort, may stubbornly resist elimination. 
Many observations made in the Columbia laboratory on small groups of 
students, working at various levels of code reception, show that difficulty 
of this sort may be revealed very early in training, and quite generally 
forecasts unsatisfactory progress. It has thus far, however, been 
impossible to secure error data from large groups of students, or students 
operating at speeds above twenty words per minute. 

* In a personal communication. The writers are also greatly indebted to 
Professor Skinner for suggesting some of the main features of the present 
analysis. 

In this connection, the broad question of economy, broached earlier in 
this paper, may again be raised. Does the present analysis point to the 
need for any change in International Morse Code which would increase 
its psychological efficiency as a medium of communication? A dogmatic 
assertion cannot be made, but, lacking the crucial experiment (which 
could easily be designed), the present writers are of the opinion 
that the appropriate replacement of three signals, H (....), B (-...), 
and V (...-), which would almost entirely eliminate the dotting error, 
would bring code skill within the grasp of a far greater number of beginners 
in a much shorter period of time. Whether it would make easier the task 
of trained operators, especially in their handling of cipher material, 
wherein the signal context is of least importance in determining the 
response, cannot be stated until appropriate studies are made. 


Received October 14, 1943. 


Personality Test Scores and Success in the Field of Nursing 


George K. Bennett and H. Phoebe Gordon 
Psychological Corporation, New York City 


A study of the adequacy of any personality test in relation to voca- 
tional success includes two phases: (1) the extent to which the test identi- 
fies traits recognizable by others and (2) the extent to which these traits, 
as identified, are assets or liabilities in a given occupation. Reviews of the 
recent studies of personality tests indicate greater emphasis on the relia- 
bility of the measures than on the validity, and very little seems to have 
been done on the relation of test scores to success in various occupations 
(1, 6, 8). 

The occupation of nursing is one of the major fields in which personal- 
ity has generally been accepted as an important factor in success. This 
importance is evidenced by such analyses as made by Miles (7). In view 
of the continuing demand for nurses, it seems desirable to study the 
validity of any available test data for its ability to identify recognizable 
traits that would contribute toward success in the field. The authors, 
therefore, undertook such a study of test scores yielded by the Bernreuter 
Personality Inventory and the Minnesota Personality Scale, for a group 
of nurses. The fact has to be faced that when statistical investigations 
are made on the validity of such tests, the results consistently appear 
disappointing (3, 4, 5). 

Before entering upon the major part of this investigation the authors 
made an attempt to determine the stability of scores made by nursing 
school applicants on the Bernreuter Personality Inventory. In 1941, a 
retest was made of 120 student nurses who had previously completed the 
Personality Inventory as part of a battery of tests required of all 
candidates. At the time of the second testing,¹ these subjects had attended 
nursing schools for approximately six months and had survived the 
probationary period. The findings of this investigation are given in Table 1. 

¹ It was explained to the subjects that the second test was for the purpose 
of obtaining research data and that the scores would not be reported to the 
school. 

It will be observed that the correlations are appreciably lower over a 
period of six months than are the reliability coefficients inferred from the 
split-half method. Some shrinkage of reliability coefficients over a period 
of this length is to be expected as due to actual intervening changes in the 
personality structure of the subjects. The fact, however, that no 
correlation was higher than .81 suggests that for individual diagnosis only 
limited confidence should be placed in the Personality Inventory scores. The 
reliabilities obtained over this period of time may be sufficiently high for 
general group predictions, if there were any cogent evidence of validity 
in the scores. 

Table 1 

                           1st Test     2nd Test     Test-Retest   Split-Half**
Trait*                     Mean Score   Mean Score   Correlation   Correlation
B1-N (Neuroticism)         -142.16      -120.75      .73           .91
B2-S (Self-sufficiency)      42.00        32.62      .74           .92
B3-I (Introversion)         -84.67       -69.17      .67           .89
B4-D (Dominance)             86.87        75.50      .81           .89

* Persons scoring high (positive) on the identified traits are said to be 
more toward the end of the trait scale as named. 

** The split-half reliabilities are those reported in the Manual for the 
Personality Inventory. 

Also to be noted in Table 1 is the fact that the mean scores have shifted 
from the first to the second administration of the test. All the shifts are 
in the “undesirable” direction: toward greater neurotic tendencies, less 
self-sufficiency, greater introversion, and greater submissiveness. If these 
students answered without falsification upon the second administration of 
the test, it is possible to conclude either that they felt less secure and stable 
after surviving a probationary period or that, at the time of the first 
examination, they were attempting to answer the questions in a way they 
thought would be most acceptable to the school to which they were apply- 
ing. The latter mentioned possibility appears more reasonable, although 
either causal condition renders the validity of the test scores suspect. 

The findings so far presented indicate first, that the Personality In- 
ventory is not a valid predictor of academic success in general, and second 
that the scores obtained appear to be modified by the circumstances 
under which the respondent completes the test. These findings are sup- 
ported by the results of other investigations. Even though this type of 
test presents these defects, it might still be an instrument of some value if 
it could be shown that scores of the sort produced by such inventories 
were predictive of the judgments subsequently made by close associates 
regarding aspects of the personality of student nurses. In order to 
investigate the possibility of such predictions the following studies were 
undertaken. 


Procedure and Results of Major Study 


Test scores were available for several hundred applicants for various 
schools of nursing. These tests had been taken as a part of a preliminary 
battery given at the time the student was an applicant for admission to 
the school. The scores used in the present study were obtained from the 
Bernreuter Personality Inventory, Minnesota Personality Scale, and 
Modified Alpha Examination Form 8. 

Rating sheets were constructed by the authors. The form of the 
rating sheets used is shown in Figure 1. Part A of Figure 1 is self- 
explanatory and will be referred to in the remainder of this report as the 
“Estimate of Personality Value.” Directions which accompanied the 
rating sheets emphasized the fact that the rating scale was to indicate the 
apparent contribution made by the personality to success and was not 
necessarily a measure of success. Thus a person who was actually a suc- 
cess in her vocation might be marked at the lower end of the scale if her 
personality seemed a handicap rather than an asset for her in achieving 
her success. Part B of Figure 1 is a check list on which the rater was 
instructed to check statements that were applicable to the person being 
rated. Each of the statements used was intended to correspond to one of 
the traits in the two inventories being studied. This method is somewhat 
similar to that used by Drake, Roslow, and Bennett (2). So far as possi- 
ble the statements were taken from the published definitions of the traits 
as given by the authors of the personality tests used in this study. For 
example, if the student showed extreme lack of the quality identified 
with Morale in Part I of the Minnesota Personality Scale, the rater might 
be expected to check statement number 25: “Tends to feel that the world 
is against her.” Possession of a moderate amount of the quality might be 
expected to result in the rater checking statement 16: “Has a wholesome 
acceptance of the world in which she lives.” Possession of an extreme 
amount of the quality might be expected to result in the rater checking the 
statement: “Too uncritical in her acceptance of every situation.” 

These rating sheets were sent to schools of nursing to which the tested 
students had been admitted. A fairly large number of students had taken 
the tests who were not admitted to any school of nursing, and some others 
were admitted to schools from which the ratings could not be obtained. 
The total number of students for whom all the desirable data could be 
obtained was 235.² 

For each of these students rating sheets were filled out by as many 
supervisors in the school as were acquainted with the student (usually 
five in number, including the director of the school, nursing arts instructor, 
and three head nurses) and by all of the student’s classmates. The 
median number of classmates rating any one student was 15. The total 
number of completed rating forms returned from the schools was 4727, 
of which 937 were from supervisors and 3790 from classmates. For each 
person studied, a Hollerith card was prepared which contained the 
following data:³ 

1. Mean of Supervisors’ Estimates of Personality Value. 

2. Mean of Classmates’ Estimates of Personality Value. 

3. Number of times each statement on the check lists was checked by 
supervisors and by classmates. 

4. Decile equivalent of the scores on the five parts of the Minnesota 
Personality Scale and on each of the four parts of the Bernreuter 
Inventory. 

5. Decile equivalent of Alpha score. 

6. Number of classmates rating the student. 

7. Number of supervisors rating the student. 


² These students were in 25 different schools, chiefly in New York State, 
Massachusetts, and New Jersey. 


A. How does Miss ______’s personality seem apt to affect her 
success in the field of nursing? (Please indicate your answer by a check 
mark on the following rating scale.) 

1. Is a marked obstacle in the way of her success. 
2. Is rather a hindrance to her. 
3. Is neither a help nor a hindrance to her. 
4. Is, on the whole, helpful to her. 
5. Is an outstanding asset for her. 

B. On the list below please check any qualities which you consider 
PARTICULARLY SIGNIFICANT IN PLACING THIS STUDENT AS YOU DO 
ON THE ABOVE RATING SCALE. If other qualities seem to have 
more bearing on her success in nursing than those listed please add the 
names or descriptions of such qualities at the end of the list. So far as 
possible select only qualities which seem most significant in relation to this 
student’s success in nursing. The number of check marks you make for 
each student will probably range from four to ten. 

1. Unable to adapt herself to working in group activities. 

2. Assumes responsibility well. 

3. Spends very little time thinking about herself. 

4. Unable to sense the emotional implications which a situation may have for 
others. 

5. Enjoys making new friends. 

6. Too retiring or timid. 

7. Too uncritical in her acceptance of every situation. 

8. Not easily disturbed emotionally by situations or people. 

9. Overdependent on parents’ approval. 

10. Has no tendency toward radicalism in economic or political policies. 

11. Too easily upset emotionally. 

12. Finds it difficult to carry an independent activity through to completion. 

13. Well balanced emotionally. 

14. Is conservative but not narrow in her attitude toward economic problems. 

15. Resents all new ideas in economic problems. 

16. Has a wholesome acceptance of the world in which she lives. 

17. Acts too impulsively. 

18. Is resentful of what she considers unfriendly family relationships. 

19. Has social poise. 

20. Completes assignments with minimum of encouragement from others. 

21. Enjoys assuming leadership. 

22. Tends to brood over situations. 

23. Confident of opportunities in the world for her future development. 

24. Over aggressive. 

25. Tends to feel that the world is against her. 

26. Withdrawn or shy in her contacts with others. 

27. Tends to prefer activity to dreaming. 

28. Too radical in ideas on political and economic problems. 

29. Has friendly family relationships. 

30. Has initiative. 

31. Too anxious for constant group activities. 

32. Is in wholesome accord with parent standards and wishes. 

33. Is not sensitive to other people’s feelings. 

34. So independent she cannot cooperate with others. 

35. Can lead when situation requires. 

36. Unhappy or restless when alone. 


Fig. 1. Rating Sheet. 


The check list of Figure 1 was first analyzed as shown in Tables 2 and 3. 
As an illustration of the method used refer to the first line of Table 2. 
The number 4 in the first column indicates that the statement was number 
4 on the check list of Figure 1. This statement is derived from the Bern- 
reuter manual definition of persons scoring low on the trait identified as 
Neurotic Tendencies (B1-N), as indicated by the third column of the 
table. Of our 235 cases, 70 constitute approximately 30 per cent with 
the lowest scores on this scale. From these 70 girls there were a total of 
322 rating sheets completed by supervisors. Statement 4 was checked 35 
times out of 322 opportunities, or in 11 per cent of the supervisors’ ratings, 
as shown by column four of Table 2. Similarly, for the 95 girls that make 
up the middle 40 per cent of the group on the neurotic tendencies scores, 
there were 400 rating sheets completed by supervisors. On 15 per cent or 
61 of the rating sheets, item number 4 was checked, as shown by column 
five of the table. For the 70 girls whose scores were in the highest 30 
per cent on neurotic tendencies there were 323 rating sheets completed 
by supervisors of which 46 or 14 per cent contained checks for item 4, as 
in column six of the table. The same method was used in computing the 
percentage scores for the classmates’ ratings. 
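
The arithmetic behind the first line of Table 2 can be retraced in a few 
lines. This is merely an illustrative sketch using the counts quoted in the 
paragraph above; the function and variable names are hypothetical. 

```python
def percent_checked(times_checked, rating_sheets):
    """Per cent of rating sheets on which a statement was checked (nearest whole)."""
    return round(100 * times_checked / rating_sheets)


# Statement 4, supervisors' ratings: (times checked, rating sheets completed)
# for each score band on the neurotic tendencies scale, as quoted in the text.
statement_4 = {
    "lowest 30%": (35, 322),
    "middle 40%": (61, 400),
    "highest 30%": (46, 323),
}

for band, (checked, sheets) in statement_4.items():
    print(band, percent_checked(checked, sheets))
```

The three results agree with the entries 11, 15, and 14 reported for the 
supervisors in the first line of Table 2. 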

³ The cards were prepared and the material tabulated by the statistical 
division of the Graduate Records Examination Bureau. 

It is assumed that if the interpretation of the traits in the personality 
tests can be made in accordance with the manuals accompanying the tests, 
we should expect to find significant relationships between the test scores 
and the checking of the statements by independent judges. Using the 
first statement in Table 2 as an illustration, since it is intended to indicate 
persons low in neurotic tendencies we might expect it to be checked much 
more frequently for students whose scores were low in “B1-N” than for 
students whose scores were high on this trait. Actually, it will be noticed 
in Table 2 that this is not what has happened. From scrutiny of the 
percentages reported in Tables 2 and 3 there seem to be very few instances 
in which percentages or percentage trends correspond to expectancies. 
Item 12 in Table 2 is one instance in which there seems to be some 
correspondence; that is, the persons making low scores in Self-sufficiency 
seem to be regarded more often as finding it “difficult to carry through an 
independent activity” than do the persons who scored high in that scale. 
This finding occurs both for the supervisors’ check list and that made by 
the classmates. However, this correspondence is not sufficiently marked 
to make the test score a very valid predictor of the estimates of associates. 
There are a few other items in the tables in which there seems to be a 
slight correspondence, but in general we must conclude either that the 
statements are not descriptive of the traits being measured by the 
various scales or that the individuals’ self-ratings in the form of test 
scores for these traits do not agree with those of their associates. 


Table 2 

Per cent of persons having scores in the indicated area for whom the 
statement was checked by supervisors and by classmates 

Serial                                                                     Supervisors          Classmates
Number on                                           Corresponding          Low   Mid.  High     Low   Mid.  High
Check List  Statement                               Scale and Placement    30%   40%   30%      30%   40%   30%

 4    Unable to sense the emotional implications    B1-N Low               11    15    14       ?     13    13
      which a situation may have for others.
33    Is not sensitive to other people's feelings.  B1-N Low               8     9     7        10    13    9
13    Well balanced emotionally.                    B1-N Moderate          53    47    53       51    47    47
 8    Not easily disturbed emotionally by           B1-N Moderate          49    43    53       34    33    36
      situations or people.
11    Too easily upset emotionally.                 B1-N High              9     10    7        12    14    17
12    Finds it difficult to carry an independent    B2-S Low               15    11    9        8     7     5
      activity through to completion.
36    Unhappy or restless when alone.               B2-S Low               5     4     1        ?     7     5
 2    Assumes responsibility well.                  B2-S Moderate          ?     ?     ?        ?     ?     ?
20    Completes assignments with minimum of         B2-S Moderate          37    41    32       33    ?     ?
      encouragement from others.
 1    Unable to adapt herself to working in         B2-S High              4     7     5        10    11    ?
      group activities.
34    So independent she cannot cooperate with      B2-S High              2     3     5        5     8     6
      others.
17    Acts too impulsively.                         B3-I Low               9     12    8        11    16    11
 3    Spends very little time thinking about        B3-I Moderate          28    20    25       24    21    22
      herself.
27    Tends to prefer activity to dreaming.         B3-I Moderate          33, 27, 27, 25, 24 (see note †)
22    Tends to brood over situations.               B3-I High              7     8     11       10    12    15
 6    Too retiring or timid.                        B4-D Low               17    11    10       14    9     7
21    Enjoys assuming leadership.                   B4-D Moderate          21    27    29       26    30    35
30    Has initiative.                               B4-D Moderate          39    42    47       44    47    48
35    Can lead when situation requires.             B4-D Moderate          40    47    48       45    48    50
24    Over aggressive.                              B4-D High              4     6     6        5     10    ?
25    Tends to feel that the world is against her.  M1-Morale Low          4     4     2        5     4     3
16    Has wholesome acceptance of the world in      M1-Morale Moderate
      which she lives.
23    Confident of the opportunities in the world   M1-Morale Moderate
      for her future development.
 7    Too uncritical in her acceptance of every     M1-Morale High
      situation.
26    Withdrawn or shy in her contacts with         M2-Soc. Adj. Low
      others.
 5    Enjoys making new friends.                    M2-Soc. Adj. Moderate
19    Has social poise.                             M2-Soc. Adj. Moderate
31    Too anxious for constant group activities.    M2-Soc. Adj. High
18    Is resentful of what she considers            M3-Fam. Rel. Low
      unfriendly family relationships.
29    Has friendly family relationships.            M3-Fam. Rel. Moderate
32    Is in wholesome accord with parent            M3-Fam. Rel. Moderate
      standards and wishes.
 9    Over dependent on parents' approval.          M3-Fam. Rel. High
22    Tends to brood over situations.               M4-Emo. Stab. Low
 3    Spends very little time thinking about        M4-Emo. Stab. Moderate
      herself.
27    Tends to prefer activity to dreaming.         M4-Emo. Stab. Moderate
17    Acts too impulsively.                         M4-Emo. Stab. High
28    Too radical in ideas on political and         M5-Ec. Cons. Low
      economic problems.
10    Has no tendency toward radicalism in          M5-Ec. Cons. Moderate
      economic or political policies.
14    Is conservative but not narrow in her         M5-Ec. Cons. Moderate
      attitude toward economic problems.
15    Resents all new ideas in economic problems.   M5-Ec. Cons. High

? Figure illegible in the source. 

† Only five of the six figures for this row survive in the source; their 
assignment to columns is uncertain. 

For the Minnesota Personality Scale statements (serial numbers taken from 
Figure 1), the percentages are not legible in the source. Two rows of 
figures survive without any indication of the statements to which they 
belong: 44 49 50 38 45 46 and 23 25 27 25 27 30. 


Table 4 

                      Supervisors'   Classmates'
Alpha Score           Estimates      Estimates
Highest 3 Deciles     3.57           3.57
Middle 3 Deciles      3.52           3.68
Lowest 3 Deciles      3.51           3.63

The next phase of the problem investigated was the relation of the 
scores to the Estimates of Personality Value, Part A of Figure 1. For 
instance, even though the scores were not identified with the expected 
behavior pattern they might still be measures of characteristics which, 
though difficult to define, are contributory to success in the vocation of 
nursing. In order to check on the reliability of the estimates of 
personality value the classmates’ estimates were divided into equal number 
groups. The correlation of these estimates was found to be .88 when 
corrected by the Spearman-Brown Prophecy Formula for the total number of 
raters. This correlation is high enough to indicate that the mean rating of 
each individual as to personality value is a reasonably accurate measure of 
the degree to which the individual’s personality is estimated by her 
associates. This fact is further corroborated by a correlation of .59 between 
supervisors’ and classmates’ ratings. 


Table 5 

Correlations Between Ratings and Specified Variables of the Minnesota 
Personality Scale 

                               Supervisors'    Classmates'
Variable                       Mean Rating     Mean Rating
1. Morale                      .087            .104
2. Social Adjustment           .242            .090
3. Family Relationships        .182            .113
4. Emotionality                .136            .068
5. Economic Conservatism       .000            .022

[A mark preceding the supervisors' value for Morale is illegible in the 
source and may be a minus sign.] 
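
The Spearman-Brown correction mentioned above has a simple closed form, 
sketched here for illustration. The text reports neither the uncorrected 
correlation nor the number of groups, so the sketch assumes two groups of 
raters (a doubling, k = 2), and the input value 0.786 is a hypothetical 
figure chosen only because it steps up to roughly the reported .88. 

```python
def spearman_brown(r, k=2.0):
    """Predicted reliability when the number of raters is multiplied by k.

    r is the correlation between the ratings of two comparable groups of
    raters; k = 2 corresponds to combining two equal groups.
    """
    return k * r / (1 + (k - 1) * r)


# Hypothetical uncorrected correlation between two groups of classmates.
uncorrected = 0.786
print(round(spearman_brown(uncorrected), 2))
```

Because the step-up always exceeds the uncorrected value when r lies between 
0 and 1, the correlation between the two groups of classmates must have 
been somewhat below .88. 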

In order to determine the extent to which intelligence of the persons 
studied might be influencing the Estimate of Personality Values, the data 
were analyzed as shown in Table 4. 

Since there is almost no difference between the means of the Estimates 
of Personality Values for those with the highest Alpha scores and for those 
with the lowest, we can assume that intelligence does not have an 
important influence on the rating. 


Table 6 

Comparison of Mean Estimates of Personality Value with Self-ratings 
(Degree to which the trait indicated by test scores is thought to 
contribute to success in nursing) 

                                    Supervisors                               Classmates
                                    Mean Estimate                Standard     Mean Estimate                Standard
                    Score           of Personality               Error of     of Personality               Error of
Test                placement       Value*         Difference**  Difference   Value          Difference**  Difference

Bernreuter 1N       Lowest 30%      3.69           -.24          .12          3.72           -.19          .10
(Sensitivity)       Middle 40%      ?                                         3.63
                    Highest 30%     3.45                                      3.53

Bernreuter 2S       Lowest 30%      ?              ?             ?            3.65           ?             ?
(Self-Sufficiency)  Middle 40%      ?                                         3.57
                    Highest 30%     ?                                         3.68

Bernreuter 3I       Lowest 30%      ?              ?             ?            3.73           ?             ?
(Introversion)      Middle 40%      ?                                         3.62
                    Highest 30%     ?                                         3.54

Bernreuter 4D       Lowest 30%      ?              ?             ?            3.59           ?             ?
(Dominance)         Middle 40%      ?                                         3.63
                    Highest 30%     ?                                         3.67

? Entry illegible in the source. The supervisors' means shown for 
Bernreuter 1N are those stated in the first footnote below. 

* For instance: The 70 students whose scores on Bernreuter 1-N place them in 
the lowest 30% of the group had an average estimate of personality value 
obtained from supervisors’ estimates of 3.69. The 70 students whose scores 
place them in the highest 30% of the total group had an average estimate of 
personality value of 3.45. The difference between these estimates is .24 
with a standard error of .12. 

** This figure is obtained by subtracting the mean of estimates for the low 
scoring group from the mean of estimates for the high scoring group. 

In so far as we may assume that Estimates of Personality Values may
be accepted as at least a moderately good criterion of the possession of
desirable traits contributing to success in nursing, the authors
analyzed the material presented in Tables 6 and 7 to study the relationships
between these Estimates of Personality Values and the self-estimates
made by the students themselves in their responses on the personality
tests used.

The data were further analyzed by determining the correlations between
the average classmates’ ratings and each of the five scores on the
Minnesota Personality Scale and also the correlations between the average
supervisors’ ratings and each of the five scores on the same inventory for
the same student nurses.4 The resulting coefficients of correlation are
presented in Table 5.

Table 7

Comparison of Mean Estimates of Personality Value with Self-rating
(Degree to which the trait indicated by test scores is thought to
contribute to success in nursing)

                        Score           Mean of Supervisors’                Mean of Classmates’
                        placement       Estimate of            Standard     Estimate of            Standard
                        of groups       Personality  Differ-   Error of     Personality  Differ-   Error of
Test                    compared*       Value        ence**    Difference   Value        ence**    Difference

Minnesota Part I        Lowest 30%      3.42         .24       .12          3.52         .18       .10
(Morale)                Middle 40%      3.60                                3.67
                        Highest 30%     3.66                                3.70

Minnesota Part II       Lowest 30%      3.31         .47       .12          3.51         .20       .10
(Social Adjust-         Middle 40%      3.57                                3.64
ment)                   Highest 30%     3.78                                3.71

Minnesota Part III      Lowest 30%      3.41         .21       .13          3.54         .06       .11
(Family                 Middle 40%      3.64                                3.70
Relations)              Highest 30%     3.62                                3.60

Minnesota Part IV       Lowest 30%      3.38         .24       .15          3.51         .18       .12
(Emotional              Middle 40%      3.61                                3.65
Stability)              Highest 30%     3.62                                3.69

Minnesota Part V        Lowest 30%      3.58         -.01      .13          3.63         -.05      .11
(Economic               Middle 40%      3.55                                3.68
Conservatism)           Highest 30%     3.57                                3.58

* For instance: The 70 students whose scores on Minnesota Part I place them in the
lowest 30% of the group had an average estimate of personality value obtained from
supervisors’ estimates of 3.42. The 70 students whose scores place them in the highest
30% of the total group had an average estimate of personality value of 3.66. The
difference between these estimates is .24 with a standard error of .12.

** This figure is obtained by subtracting the mean of estimates for the low scoring
group from the mean of estimates for the high scoring group.

4 Correlations were not obtained in a similar manner for the scores on the Bernreuter
test because records were kept of decile scores on this scale and so could not be correlated
directly. However, from the data in Tables 6 and 7 equally low correlations can be
inferred.


Referring to Tables 6 and 7 it will be noticed that the only section in
which the students who scored low in a given part of a personality test
had significantly lower personality ratings than those who scored high in
the test is in the case of Minnesota Part II (Social Adjustment). This is
corroborated by the correlation of .242 found in Table 5. It is also
evident that there is a possibility of a significant difference in the
Estimates grouped by scores for Minnesota I (Morale) and for Bernreuter 1N
and 3I (Neurotic tendencies and Introversion). The other scores give no
evidence of a significant difference. The correlations in Table 5 hold
little promise for important relationships between the divisions of the
Minnesota test and ratings made by associates or supervisors, with the
exception of the social adjustments correlated with ratings by supervisors.
The authors can only conclude, therefore, that only in the four cases
mentioned is there any indication that a score in either of the tests can be
used to indicate possession of a trait desirable for nursing.5
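
One way to read the differences and standard errors in Table 7 is to express each difference in standard-error units. The sketch below does this for the supervisors’ columns. The authors do not state the criterion by which they judged a difference significant, so the use of this ratio is an assumption made for illustration, and the names `table7_supervisors` and `critical_ratio` are hypothetical.

```python
# (difference, standard error) pairs for the supervisors' estimates,
# copied from Table 7.
table7_supervisors = {
    "Minnesota Part I (Morale)": (0.24, 0.12),
    "Minnesota Part II (Social Adjustment)": (0.47, 0.12),
    "Minnesota Part III (Family Relations)": (0.21, 0.13),
    "Minnesota Part IV (Emotional Stability)": (0.24, 0.15),
    "Minnesota Part V (Economic Conservatism)": (-0.01, 0.13),
}


def critical_ratio(difference: float, standard_error: float) -> float:
    """Difference between group means expressed in standard-error units."""
    return difference / standard_error


# Assumption: the ratio of a difference to its standard error is used as
# the yardstick; the source does not name its significance criterion.
ratios = {
    test: critical_ratio(diff, se)
    for test, (diff, se) in table7_supervisors.items()
}

# List the scales from the largest to the smallest absolute ratio.
for test, ratio in sorted(ratios.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{test}: {ratio:.2f}")
```

On this reading, Minnesota Part II stands apart with a ratio near 3.9, Part I gives a ratio of 2.0, and the remaining scales fall below 2, which agrees with the ordering described in the text.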


Summary and Conclusions 


In attempting to determine the relationships between measures on 
standard personality questionnaires and individual success in the field of 
nursing, several preliminary studies were made. The stability of per- 
sonality test scores was measured over a period of six months for 120 stu- 
dent nurses. Rating scales and check lists of traits were constructed to 
measure qualities similar to those which two standardized questionnaires 
purport to measure. The check lists were submitted to supervisors and 
colleagues of 235 student nurses from different schools, and comparisons 
were made between scores on the tests and ratings on the check lists. 

From an analysis of the data resulting from the described studies the 
authors feel the following conclusions to be warranted. 

1. Scores on the Bernreuter Personality Inventory are relatively stable 
over a period of months but are not sufficiently consistent to be used for 
individual prediction. This conclusion must be qualified to apply to 
students in nursing schools, and the instabilities in the test scores may be 
the result of changes in the individuals due to the training they undergo. 
However, it is to such a population that we are applying the conclusions 
from these studies. 

2. In the case of students who are tested as a part of a selection process 
there is an apparent tendency to modify their responses in a favorable 
direction. This tendency in itself may conceivably be one of the causes 
of the generally poor validity of paper and pencil personality inventories. 

3. There is a marked lack of agreement between test scores on the
personality questionnaires employed and ratings by supervisors and
colleagues. The significance of this finding is easily seen in that it is the
judgment of the nursing supervisors which is the greatest factor
determining the student’s success while in training. To some extent the lack
of agreement found might be due to the real lack of similarity between the
traits compared. However, the validity of the ratings and check lists is
no more questionable than that of the traits identified by the standardized
personality tests.


5 It is really only to three cases that the conclusion applies, since the two scores on
the Bernreuter test correlate so highly (.95) as to be measuring the same factor.


4. The intelligence of the student nurse is independent of the per- 
sonality value ratings given her by classmates and supervisors. 

5. When tested against a rating scale of the degree to which the per- 
sonality of an individual contributed to her success in training, the per- 
sonality tests used demonstrate an almost negligible power of prediction. 
In itself this is a very important result, since the personality tests are often 
used as indicators of personality traits which will supposedly aid the 
individual in making adjustments to her work in training and in her career. 

6. To the extent that it is possible to generalize from the findings pre- 
sented in the present study, it would appear that the type of personality 
test used is of little or no value as a part of a battery of tests used in 
personnel selection, since it will predict neither success nor the attitudes 
of colleagues or supervisors. 


Received April 26, 1944.


Book Reviews


Cardall, A. J. A wartime guidance program for your school. Chicago: 
Science Research Associates, 1943. pp. 104. 


Most of this book deals with matters of interest primarily to school 
counselors and to teachers interested in vocational guidance. But it 
concerns readers of this journal for two reasons: (1) it is built around a 
psychological technique, upon the soundness of the application of which 
psychologists are called upon to pass, and (2) it is one of an increasing 
number of such publications receiving wide distribution by Science Re- 
search Associates. Many school administrators, teachers, and parents 
will judge the effectiveness of applied psychology by the results attained 
by teachers who use the procedures and techniques outlined in these 
publications. 

This booklet is based upon the fact that most of the guidance work
that is done in our schools is done by relatively untrained persons. Its
author goes on the assumption that a guidance procedure of “mechanistic
form is essential in aiding school personnel with limited training for
guidance to do a better job.”

Cardall’s aim is to make available to teacher-counselors a technique 
for individual analysis which leaves little to the teacher’s knowledge of 
measurement or familiarity with occupational abilities. He therefore 
recommends two or three selected tests for each of seven types of abilities 
or traits. These same traits are used in rating a large number of wartime 
jobs, both civilian and military. The “mechanistic” procedure recom- 
mended is that one or more tests for each of these types of traits be given 
to each student, and that test profiles then be matched with the ability 
profiles made by Cardall by means of rating procedures. The implication 
is that it is better to devote one’s energies to putting mechanical devices 
into the hands of persons who do not understand their limitations, than 
it is to spend the same amount of time giving these same people more 
insight into the problems involved. Most psychologists have shied away 
from putting tools into the hands of untrained persons; Cardall, however, 
urges that this is better than having untrained persons use no tools at all. 

Granting (for the sake of argument only) that Cardall’s position is 
sound, there is still a question concerning the validity of the instruments. 
Are these tools really better, in the hands of untrained workers, than no 
tools at all? 

The tests discussed are among the more respectable psychological
tests. That the seven highly recommended tests are published by Science
Research Associates, Cardall’s employers and publishers, and that six of
the twelve “also recommended” tests are handled by the same publishers,
is unfortunate, for it naturally suggests a possible pecuniary motive for
advocating putting these tests in the hands of large numbers of untrained
persons. In the hands of trained persons, and supplemented by
occupational norms, the only cause for misgivings would be the limited number of
traits and abilities tested. When used by untrained persons, and
especially for the evaluation of rated traits whose relationship to the tested
traits has not been uniformly established, their value is questionable.

The method of making the occupational ratings is not explained in 
detail. We are told only that they are based on “pooled ratings of re- 
quirements by vocational psychologists”; how many psychologists, and 
how familiar they were with occupational abilities research, is not stated. 
Neither is there any indication of the extent of agreement among the 
raters. Cardall has presumably copied, superficially, the Minnesota Oc- 
cupational Rating Scales, but as many of the jobs rated are not included 
in the previously published studies and as there is little overlapping of the 
traits in the two scales, there must have been a greater reliance on personal 
opinion than on research data. The viewpoint of the Minnesota Scale 
authors concerning these techniques is, incidentally, diametrically op- 
posed to Cardall’s. They state: “In no sense are these techniques to be 
used as devices in mechanical counseling . . . the amateur will obtain 
little aid from them.” 

As this reviewer’s full-time assignment has for nearly two years con- 
sisted of job and man analysis and test construction for the selection and 
classification of Army Air Forces pilots, navigators, and bombardiers, he 
was naturally interested in the requirements for these jobs as reported by 
Cardall. Typical defects in these data follow. Although spatial judg- 
ment is correctly rated as important in pilots, this is to be measured by the 
Minnesota Paper Form Board, whereas AAF experience has shown that 
the types of spatial judgment needed in flying are much more complex 
than that measured by this test. Most serious error of all, perhaps, is the 
failure to rate pilots on coordination and certain other motor and per- 
ceptual skills, not provided for by such a general rating scale. The list 
of traits used is too brief to be very helpful with specific jobs, and the best 
hunches of vocational psychologists are too invalid for unvalidated use in 
developing job specifications.

A few other unqualified recommendations of questionable psychological
procedures might be mentioned. The use of linguistic and quantitative
intelligence tests is recommended, with no mention of the caution
needed in using part scores, the differential predictive value of which has
been disproved as frequently as it has been proved. Cardall’s test of
Practical Judgment is strongly recommended, and jobs are rated for that
trait, although this reviewer has yet to see validation data for a judgment
test. The use of lay counselors by schools in order to supplement their
facilities is urged, without mention of the serious pitfalls encountered in
such programs nor of the safeguards which can be taken to avoid some of
them. A fifteen minute interview is recommended for the discussion of
students’ plans in the light of test and other data!

Science Research Associates have previously rendered some very real 
services to vocational psychologists and counselors in the publication of 
occupational information pamphlets, tests, and testing literature. It is 
to be hoped that the current wedding of guidance literature to marketing 
methods will not result in the vitiation of psychological procedures and the 
discrediting of our profession. 

Donald E. Super,


Captain, Air Corps 
Psychological Research Unit No. 1, 
Nashville Army Air Center 


Carlé, Charles. Mysticism in modern psychology. New York: Psycho-
Sociological Press, 1943. Pp. 47. $1.00.

In his little booklet Charles Carlé, Ph.D., objects to three kinds of
“secularized mysticism”: Freudian psychoanalysis, the Rorschach test as
currently used by most testers, and H. A. Murray’s Harvard study
embodied in Explorations in Personality. Dr. Carlé does not object in the
tempered, judicial fashion supposedly characteristic of the scientist, but
with a vehemence that runs into violence. He says that H. E. Burtt, in
his new edition of Principles of Employment Psychology, “seems to be
fighting with blinkers on both eyes” in attacking only “the weakest and
most harmless” of the “gold bricks” instead of Freudianism and Rorschach
testing; and that Murray’s study is “on the quack level,” as well as being
“one of the most valuable contributions to applied psychology,” because,
like applied psychology in general, the Harvard experiment “reveals the
same jumble of unrelated tests, based on controversial philosophies—if
upon any at all.”

Freud is the cause of the greater part of Carlé’s explosion. According
to the latter, Freudian psychoanalysis was the product of the fear and
anxiety occasioned by the time and place in which its originator lived.
Freud’s escape was by creating “the religion of the unconscious.” Carlé
deprecates the fact that practicing psychoanalysts, once legally
authorized, are under no further supervision. They may become psychiatric
cases themselves and give irresponsible advice. Remembering just such
a situation, the reviewer agrees with the author.

Carlé’s language is as picturesque as it is dogmatic. “The
‘unconscious,’ the ‘Id,’ is nothing more than a revival of the devil incarnate,”
and psychoanalysis is “religion-ersatz without moral code”; the Rorschach
testers are “selling the Brooklyn Bridge” in monopolizing the method;
and Murray’s subjects are the “victims of mass assault” in being subjected
to exhaustive tests built upon a philosophy concocted “from a little bit
of everything on the ideological shelves.”

The author’s solution lies in a “nonmystical, holistic concept of human
nature,” “a psycho-sociological product,” based on a philosophy of the
Gestalt variety, which is to reach out to national affairs. Details as to
application are lacking, and herein lies a chief weakness of the booklet.
But it is hinted that right diagnosis of personality will come through the
author’s revision of Rorschach, now in preparation.

Dorothy Hazeltine Yates


San Jose State College

Books, monographs, and pamphlets for listing and possible review should be